1. Executive Summary
This document outlines the Day 1 analysis for the mynewsanalyzer iOS mobile application. The application's core objective is to aggregate the latest news based on user-defined topics, summarize the content using on-device Small Language Models (SLMs), and classify the news into user-defined groups (e.g., "For" and "Against").
Core Pillars of Analysis:
- News Aggregation & Crawling: Scalable ingestion strategies for global real-time news.
- On-device Summarization: Leveraging Apple Silicon for privacy-first AI summaries.
- On-device Classification: Zero-shot dynamic grouping using SLMs.
2. News Aggregation & Crawling
Recommended News APIs (2026)
NewsAPI.org / GNews.io
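Developer-friendly JSON REST APIs with granular filtering by keyword, language, and country. MVP recommendation: start with one of these for ease of integration and free developer tiers, and confirm each provider's free-tier limits (request quotas, article delay, production-use terms) before launch.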
NewsData.io / Webz.io
Better suited when the app needs deep historical archives (7+ years) or sentiment fields supplied directly by the API.
Crawling Fallback (ScrapingBee)
For sources not covered by the APIs, ScrapingBee provides headless browsing, proxy rotation, and JavaScript rendering, which reduces the risk of being blocked during targeted scraping.
Data Ingestion Strategy
The backend polls the news APIs for each user-defined topic, normalizes the JSON payloads into a common SQL schema, and serves the results to the iOS app through a custom REST or GraphQL API.
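The normalization step can be sketched as follows. This is a minimal illustration, written in Swift only for consistency with the iOS client, since this document does not fix a backend language. The payload shape is loosely modeled on a NewsAPI-style response and should be checked against the provider's documentation; `ProviderResponse`, `Article`, and `normalize` are hypothetical names, and using the article URL as the de-duplication key is an assumption.

```swift
import Foundation

// Hypothetical provider payload, loosely modeled on a NewsAPI-style response.
// Field names are assumptions; verify them against the provider's documentation.
struct ProviderResponse: Decodable {
    struct Item: Decodable {
        struct Source: Decodable { let name: String? }
        let source: Source?
        let title: String?
        let url: String?
        let publishedAt: String?
        let content: String?
    }
    let articles: [Item]
}

// Hypothetical normalized record that the backend would store and serve to iOS.
struct Article: Codable, Equatable {
    let id: String          // article URL, used here as a de-duplication key (assumption)
    let topic: String
    let title: String
    let sourceName: String
    let publishedAt: Date?
    let body: String
}

/// Decodes one provider payload and maps it to normalized articles,
/// skipping items without a URL or title and dropping duplicate URLs.
func normalize(_ data: Data, topic: String) throws -> [Article] {
    let response = try JSONDecoder().decode(ProviderResponse.self, from: data)
    let formatter = ISO8601DateFormatter()
    var seen = Set<String>()
    var result: [Article] = []
    for item in response.articles {
        guard let url = item.url, let title = item.title, !title.isEmpty else { continue }
        guard seen.insert(url).inserted else { continue }
        result.append(Article(
            id: url,
            topic: topic,
            title: title,
            sourceName: item.source?.name ?? "unknown",
            publishedAt: item.publishedAt.flatMap { formatter.date(from: $0) },
            body: item.content ?? ""))
    }
    return result
}
```

Each additional provider would get its own decodable payload type that maps into the same `Article` record, so the iOS client only ever sees one shape.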
3. News Summarization via SLMs
Running summarization on-device improves user privacy (no article text is sent to a cloud inference service), reduces recurring cloud costs, and allows already-fetched articles to be summarized offline.
Framework: Apple MLX
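MLX, Apple's open-source array framework for machine learning on Apple Silicon, is a strong option for running generative models on recent iPhones and iPads. It takes advantage of the unified memory shared by the CPU and GPU on A-series and M-series chips.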
Optimal Models
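- Llama 3.2 (1B / 3B Instruct): Meta's lightweight text models, run via MLX Swift. (The 8B size belongs to Llama 3.1, not 3.2.)
- Qwen2.5 (7B, with 1.5B and 3B variants): Strong multilingual support; the smaller variants fit more comfortably in iPhone memory.
- Gemma 2 (9B, with a 2B variant): Google's open model family; the 9B is likely too large for most iPhones even at 4-bit, so the 2B is the realistic on-device choice.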
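Whether a model fits on a phone depends largely on the memory its weights occupy, which is roughly the parameter count times the bits per weight, divided by 8. The sketch below works through that arithmetic for a few parameter counts. The function name is illustrative, and the estimate deliberately ignores quantization scales, the KV cache, and activations, so real usage will be higher.

```swift
import Foundation

/// Approximate size of the model weights in gigabytes (10^9 bytes).
/// Ignores quantization scales, the KV cache, and activations.
func approxWeightGB(parametersInBillions: Double, bitsPerWeight: Double) -> Double {
    let bytes = parametersInBillions * 1_000_000_000 * bitsPerWeight / 8
    return bytes / 1_000_000_000
}

// Compare a few parameter counts at 4-bit quantization.
for size in [1.0, 3.0, 8.0] {
    let gb = approxWeightGB(parametersInBillions: size, bitsPerWeight: 4)
    print("\(size)B parameters at 4-bit: about \(gb) GB of weights")
}
```

By this estimate a 3B model needs about 1.5 GB for weights at 4-bit, while an 8B model needs about 4 GB before any runtime overhead is counted.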
Summarization Strategy
Load a 4-bit quantized model with mlx-swift, then pass the article text to the local model with the prompt: "Summarize the following news article in 3 bullet points." Articles longer than the model's context window must be truncated or summarized in chunks.
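The prompt handling around the model call might look like the sketch below. It deliberately does not call mlx-swift: `TextGenerator` is a hypothetical protocol standing in for whatever wrapper the app places around the loaded model, and the 6000-character budget is an assumed placeholder rather than a measured limit. The helper names are illustrative as well.

```swift
import Foundation

/// Hypothetical abstraction over the on-device model. In the app this would wrap
/// the loaded mlx-swift model; the protocol itself is not part of mlx-swift.
protocol TextGenerator {
    func generate(prompt: String) throws -> String
}

/// Builds the summarization prompt, truncating long articles so the prompt
/// stays within the model's context window (character budget is an assumption).
func makeSummaryPrompt(article: String, maxCharacters: Int = 6000) -> String {
    let trimmed = article.trimmingCharacters(in: .whitespacesAndNewlines)
    let body = String(trimmed.prefix(maxCharacters))
    return """
    Summarize the following news article in 3 bullet points.

    Article:
    \(body)
    """
}

/// Extracts bullet lines (starting with "-", "*", or "•") from raw model output.
func parseBullets(_ output: String) -> [String] {
    output
        .split(whereSeparator: \.isNewline)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { $0.hasPrefix("-") || $0.hasPrefix("*") || $0.hasPrefix("•") }
        .map { String($0.dropFirst()).trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
}

/// Runs the prompt through the model and keeps at most three bullets.
func summarize(_ article: String, using model: TextGenerator) throws -> [String] {
    let output = try model.generate(prompt: makeSummaryPrompt(article: article))
    return Array(parseBullets(output).prefix(3))
}
```

Keeping prompt construction and output parsing separate from the model call makes both testable with a stub generator, without loading a model.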
4. News Classification
Approach 1: Zero-Shot Prompting
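Reuses the generative SLM already loaded for summarization (Llama 3.2 or Qwen2.5). The prompt lists the user-defined groups and asks the model to assign the article to exactly one of them, so new groups require no retraining.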
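A zero-shot prompt and a strict parser for the model's answer could be sketched as follows. The prompt wording and both function names are assumptions for illustration; the resulting prompt would be passed to the same on-device model used for summarization. The parser accepts only an answer that matches one group name after trimming whitespace and punctuation, and returns nil otherwise so the caller can retry or leave the article unclassified.

```swift
import Foundation

/// Builds a zero-shot prompt asking the model to pick exactly one user-defined group.
func makeClassificationPrompt(summary: String, groups: [String]) -> String {
    let options = groups.map { "- \($0)" }.joined(separator: "\n")
    return """
    Classify the news summary into exactly one of these groups:
    \(options)

    Summary: \(summary)

    Answer with the group name only.
    """
}

/// Maps the model's answer back to one of the allowed groups (case-insensitive).
/// Returns nil when the answer is not exactly one of the group names.
func parseGroup(from output: String, groups: [String]) -> String? {
    let junk = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)
    let answer = output.trimmingCharacters(in: junk).lowercased()
    return groups.first(where: { $0.lowercased() == answer })
}
```

Exact matching is chosen over substring matching because short group names such as "For" occur inside unrelated words, which would produce false positives.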
Approach 2: Core ML Fallback
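Train a lightweight Create ML MLTextClassifier on user-labeled examples and run the resulting model through Core ML. It is faster and more battery-efficient than a generative model, but it must be retrained whenever the user's groups change.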
5. System Consistency & Constraints
Strict Cross-Component Impact Analysis:
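Any change to one component (for example, the article schema or a prompt format) requires checking its effect on every component that depends on it, and the project must continue to compile after each change.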
Next Steps: Day 2
Day 2 will focus on Data Design, mapping out the SQL structures and object models needed to support this architecture.