SimpleNews Agent System

A sophisticated multi-agent pipeline for autonomous AI news research, writing, and publication.


Overview

SimpleNews employs a multi-agent orchestration system that autonomously discovers newsworthy AI topics, generates original articles, and publishes them with full editorial quality controls. The system leverages Claude's Agent SDK for intelligent research, structured output generation for consistent article quality, and semantic deduplication to prevent content overlap.

Key Capabilities

| Capability | Implementation |
| --- | --- |
| Autonomous Research | Claude Agent SDK with 14 MCP tools |
| Multi-Platform Discovery | X, HN, Reddit, GitHub, arXiv |
| Article Generation | Claude Sonnet with structured output |
| Link Enrichment | Server-side web search integration |
| Semantic Deduplication | pgvector with 90% similarity threshold |
| ISR Publishing | Next.js incremental static regeneration |

System Architecture

The system follows a sequential pipeline architecture with autonomous decision-making at the research phase.

Directory Structure

    agent/
    ├── src/
    │   ├── run.ts                # Main orchestration pipeline
    │   ├── research-agent.ts     # Claude Agent SDK integration
    │   ├── writer.ts             # Article generation
    │   ├── enrichment-agent.ts   # Web search enrichment
    │   ├── publisher.ts          # Supabase publishing
    │   ├── deduplication.ts      # Semantic duplicate check
    │   ├── embeddings.ts         # OpenAI embedding calls
    │   ├── scheduler.ts          # Interval-based scheduling
    │   └── tools/                # MCP tool implementations
    │       ├── x-search.ts       # X/Twitter API
    │       ├── hackernews.ts     # HN search
    │       ├── reddit.ts         # Reddit API
    │       ├── arxiv.ts          # arXiv papers
    │       └── github-trending.ts

Agent Pipeline

The complete pipeline executes in 6 distinct phases, with comprehensive error handling and cost tracking at each stage.
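
The sequential flow with per-phase cost tracking can be sketched as follows. This is a minimal illustration, not the contents of `run.ts`: the `Phase`, `PhaseResult`, and `RunReport` types and the `runPipeline` function are hypothetical, and the six phase names are not listed in this section (research, writing, enrichment, deduplication, publishing, and revalidation would be a plausible reading of the directory structure, but that is an assumption).

```typescript
// Hypothetical sketch: types and names below are assumptions, not taken from run.ts.
interface PhaseResult<T> {
  output: T;
  costUsd: number; // cost reported by the phase itself
}

interface Phase {
  name: string;
  run: (input: unknown) => Promise<PhaseResult<unknown>>;
}

interface RunReport {
  completed: string[];
  failedAt?: string;
  error?: string;
  totalCostUsd: number;
  output?: unknown;
}

// Runs phases in order, feeding each phase's output to the next.
// Stops at the first failure and reports the cost accumulated so far.
async function runPipeline(phases: Phase[], initialInput: unknown): Promise<RunReport> {
  const report: RunReport = { completed: [], totalCostUsd: 0 };
  let current = initialInput;
  for (const phase of phases) {
    try {
      const result = await phase.run(current);
      report.totalCostUsd += result.costUsd;
      report.completed.push(phase.name);
      current = result.output;
    } catch (err) {
      report.failedAt = phase.name;
      report.error = err instanceof Error ? err.message : String(err);
      return report;
    }
  }
  report.output = current;
  return report;
}
```

Returning a report instead of throwing keeps the accumulated cost visible even when a run aborts partway through.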


Research Agent

The Research Agent is the system's autonomous discovery engine, powered by Claude's Agent SDK with Model Context Protocol (MCP) integration.

Architecture

| Property | Value |
| --- | --- |
| Model | claude-sonnet-4-5-20250929 |
| Max Turns | 40 |
| Budget | $1.50 per run |
| Output | Structured JSON with validation |

Research Strategy

The agent follows a multi-phase discovery strategy that prioritizes grassroots sources before mainstream outlets.

MCP Tools

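The research agent has access to 14 specialized tools. Four of them are named in the Cost Optimization section: hn_front_page, reddit_rising, github_trending, and x_indie_discovery.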

Diversity Requirements

The agent enforces strict diversity rules to ensure balanced coverage:

| Requirement | Limit |
| --- | --- |
| Indie/Community findings | At least 2 |
| Non-X platform sources | At least 1 |
| Categories represented | At least 2 |
| Findings from the same company | At most 3 |
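
A check for these rules could look roughly like the sketch below. It is an illustration under stated assumptions: `FindingLike` and `checkDiversity` are hypothetical names, the `company` field is invented for the example (the output schema has no company attribute), and "non-X platform sources" is read as "findings that cite at least one platform other than X".

```typescript
// Minimal shape needed for the check; the real finding schema has more fields.
interface FindingLike {
  category: string;
  source_type: "mainstream" | "indie" | "community" | "research" | "open_source";
  source_platforms: string[];
  company?: string; // hypothetical field, used only for the same-company cap
}

// Returns a list of violated rules; an empty list means the set is acceptable.
function checkDiversity(findings: FindingLike[]): string[] {
  const violations: string[] = [];

  const grassroots = findings.filter(
    (f) => f.source_type === "indie" || f.source_type === "community"
  ).length;
  if (grassroots < 2) violations.push("fewer than 2 indie/community findings");

  // Assumption: a finding counts as non-X if any of its platforms is not "x".
  const nonX = findings.filter((f) => f.source_platforms.some((p) => p !== "x")).length;
  if (nonX < 1) violations.push("no finding from a non-X platform");

  const categories = new Set(findings.map((f) => f.category));
  if (categories.size < 2) violations.push("fewer than 2 categories represented");

  const perCompany = new Map<string, number>();
  findings.forEach((f) => {
    if (f.company) perCompany.set(f.company, (perCompany.get(f.company) ?? 0) + 1);
  });
  perCompany.forEach((count, company) => {
    if (count > 3) violations.push(`more than 3 findings about ${company}`);
  });

  return violations;
}
```

Returning every violation at once, instead of failing on the first, would let the agent be re-prompted with the full list of gaps.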

Output Schema

    interface ResearchFinding {
      topic: string
      summary: string
      detailed_context: string // Full briefing for writer
      why_newsworthy: string
      category: "models" | "companies" | "research" | "policy" | "tools" | "funding"
      engagement_level: "viral" | "high" | "moderate"
      source_type: "mainstream" | "indie" | "community" | "research" | "open_source"
      source_platforms: string[] // ["x", "hackernews", "reddit", ...]
      source_posts: {
        url: string
        author_handle: string
        text_snippet: string
        likes: number
        retweets: number
      }[]
      suggested_tags: string[]
    }

Writer Agent

The Writer Agent transforms research findings into polished news articles with consistent structure and quality.

Processing Flow

Article Structure

Each generated article follows a consistent format:

  1. Opening Paragraph - News lead with key facts
  2. H2 Sections - Organized by topic
  3. Bullet Points - For lists and features
  4. Key Takeaways - 3-5 summarized points
  5. Citations - Structured source attribution

Output Schema

    interface ArticleDraft {
      title: string
      content: string // Markdown format
      excerpt: string // 1-2 sentences
      category: string
      tags: string[]
      key_takeaways: string[] // 3-5 items required
      source_urls: string[]
      citations: {
        title: string
        url: string
        author?: string
        publication?: string
        publication_date?: string
      }[]
    }
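
The constraints noted in the schema comments (such as the 3-5 key takeaways) are not enforced by the TypeScript types themselves, so a runtime check is one way to catch a malformed draft. The sketch below is an assumption about how that could be done: `DraftLike` and `validateDraft` are hypothetical names, and the citation URL rule is an illustrative choice, not something the schema states.

```typescript
// Only the fields this check reads; the full draft schema has more.
interface DraftLike {
  title: string;
  content: string;
  key_takeaways: string[];
  citations: { title: string; url: string }[];
}

// Returns human-readable errors; an empty array means the draft passed.
function validateDraft(draft: DraftLike): string[] {
  const errors: string[] = [];
  if (draft.title.trim() === "") errors.push("title is empty");
  if (draft.content.trim() === "") errors.push("content is empty");

  const n = draft.key_takeaways.length;
  if (n < 3 || n > 5) errors.push(`key_takeaways has ${n} items, expected 3-5`);

  draft.citations.forEach((c, i) => {
    if (c.title.trim() === "") errors.push(`citations[${i}] has no title`);
    // Assumption: citations must be absolute http(s) URLs.
    if (!/^https?:\/\//.test(c.url)) errors.push(`citations[${i}].url is not an absolute URL`);
  });
  return errors;
}
```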

Enrichment Agent

The Enrichment Agent enhances articles with authoritative inline links and improved citations through web search.

Enrichment Process

Validation Rules

| Check | Requirement |
| --- | --- |
| Word Count Change | May grow by at most 25% or shrink by at most 10% |
| Link Protocol | HTTPS only |
| Anchor Text | Descriptive (no "click here") |
| Content Integrity | Title and key paragraphs preserved |
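
As a rough illustration, the first three checks and the title part of the fourth can be expressed as a single function. Everything named here (`ArticleText`, `validateEnrichment`, `GENERIC_ANCHORS`) is hypothetical, the list of generic anchors beyond "click here" is an assumption, and the "key paragraphs preserved" check is left out because the source does not say how key paragraphs are identified.

```typescript
interface ArticleText {
  title: string;
  content: string; // Markdown
}

// Assumption: only "click here" comes from the rules above; the rest are examples.
const GENERIC_ANCHORS = ["click here", "here", "read more", "link"];

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Returns a list of problems; an empty list means the enriched version is accepted.
function validateEnrichment(original: ArticleText, enriched: ArticleText): string[] {
  const problems: string[] = [];

  if (original.title !== enriched.title) problems.push("title was changed");

  const before = countWords(original.content);
  const after = countWords(enriched.content);
  if (before > 0) {
    const change = (after - before) / before;
    if (change > 0.25) problems.push("word count grew by more than 25%");
    if (change < -0.1) problems.push("word count shrank by more than 10%");
  }

  // Markdown inline links of the form [anchor](url).
  const linkPattern = /\[([^\]]+)\]\(([^)\s]+)\)/g;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(enriched.content)) !== null) {
    const anchor = match[1].trim().toLowerCase();
    const url = match[2];
    if (!url.startsWith("https://")) problems.push(`non-HTTPS link: ${url}`);
    if (GENERIC_ANCHORS.includes(anchor)) problems.push(`generic anchor text: "${match[1]}"`);
  }
  return problems;
}
```

A caller that receives a non-empty list could fall back to the original article, which matches the graceful-degradation strategy listed under Error Handling.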

Publishing Pipeline

The publishing pipeline ensures only unique, high-quality articles reach the database.
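
The uniqueness check relies on comparing embeddings against the 90% similarity threshold. In production that comparison would presumably run inside the database (pgvector's `<=>` operator returns cosine distance, so 90% similarity corresponds to a distance of at most 0.1), but the logic can be illustrated in memory. The sketch assumes cosine similarity is the metric and that a score at or above the threshold counts as a duplicate; `cosineSimilarity` and `isDuplicate` are hypothetical names.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const SIMILARITY_THRESHOLD = 0.9; // the 90% threshold from Key Capabilities

// True if the candidate is at least as similar as the threshold to any existing article.
function isDuplicate(
  candidate: number[],
  existing: number[][],
  threshold: number = SIMILARITY_THRESHOLD
): boolean {
  return existing.some((e) => cosineSimilarity(candidate, e) >= threshold);
}
```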

ISR Revalidation

After publishing, the system triggers Next.js Incremental Static Regeneration:

  • Home Page (/) - Updated article list
  • Article Page (/news/{slug}) - New article accessible
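
The two paths above can be derived from the article slug. How the agent actually reaches the Next.js app is not described here, so the sketch below injects the revalidation call as a callback; inside a Next.js App Router project that callback could be `revalidatePath` from `next/cache`, but that wiring is an assumption. `pathsToRevalidate` and `revalidateArticle` are hypothetical names.

```typescript
// Paths that need regeneration after an article with the given slug is published.
function pathsToRevalidate(slug: string): string[] {
  return ["/", `/news/${encodeURIComponent(slug)}`];
}

// The revalidate callback is injected so this sketch does not depend on Next.js.
function revalidateArticle(slug: string, revalidate: (path: string) => void): string[] {
  const paths = pathsToRevalidate(slug);
  // Call with the path only, so forEach's index argument is never forwarded.
  paths.forEach((path) => revalidate(path));
  return paths;
}
```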

Data Models

Database Schema

E-E-A-T Fields

Fields that support Google's E-E-A-T guidelines (Experience, Expertise, Authoritativeness, Trustworthiness) are embedded in the article schema:

| Field | Value |
| --- | --- |
| author_name | "SimpleNews AI" |
| author_credentials | "AI Research Assistant" |
| methodology | "AI-generated from verified sources" |
| citations | Structured JSON array |
| fact_check_status | One of: pending, verified, flagged, failed |

External Integrations

API Costs

| Service | Cost |
| --- | --- |
| Claude API | $3/M input tokens, $15/M output tokens |
| X API | $0.005/post, $0.01/user lookup |
| Web Search | $0.01/search |
| OpenAI Embeddings | $0.00002/1K tokens |
| Hacker News | Free |
| Reddit | Free |
| arXiv | Free |
| GitHub | Free (rate limited) |
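
To show how these rates combine into a per-run figure, here is a small estimator. The rates are the ones listed above; the `Usage` shape, the `estimateCostUsd` function, and the usage numbers in the example are hypothetical and chosen only for illustration.

```typescript
// Rates from the API Costs table, in USD.
const PRICING = {
  claudeInputPerMTok: 3,
  claudeOutputPerMTok: 15,
  xPerPost: 0.005,
  xPerUserLookup: 0.01,
  webSearch: 0.01,
  embeddingPer1KTok: 0.00002,
};

// Hypothetical usage record for one pipeline run.
interface Usage {
  claudeInputTokens: number;
  claudeOutputTokens: number;
  xPosts: number;
  xUserLookups: number;
  webSearches: number;
  embeddingTokens: number;
}

function estimateCostUsd(u: Usage): number {
  return (
    (u.claudeInputTokens / 1_000_000) * PRICING.claudeInputPerMTok +
    (u.claudeOutputTokens / 1_000_000) * PRICING.claudeOutputPerMTok +
    u.xPosts * PRICING.xPerPost +
    u.xUserLookups * PRICING.xPerUserLookup +
    u.webSearches * PRICING.webSearch +
    (u.embeddingTokens / 1000) * PRICING.embeddingPer1KTok
  );
}

// Illustrative numbers only, not measurements from the system.
const exampleRunCost = estimateCostUsd({
  claudeInputTokens: 200000,
  claudeOutputTokens: 20000,
  xPosts: 100,
  xUserLookups: 0,
  webSearches: 5,
  embeddingTokens: 10000,
});
```

With these example numbers the Claude tokens and the X posts dominate the total, while embeddings contribute a negligible amount.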

Cost Optimization

The system employs multiple strategies to minimize API costs:

Optimization Techniques

  1. Caching - X API responses cached for 30 minutes
  2. Free Tool Priority - hn_front_page, reddit_rising, github_trending called first
  3. Broad Queries - Single x_indie_discovery call with OR query vs. multiple calls
  4. Batch Processing - 3 articles per Claude API call
  5. Search Limits - Max 5 web searches per article enrichment
  6. Budget Enforcement - $1.50 hard limit per research run
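
Two of these techniques, the 30-minute cache and the hard budget limit, are simple enough to sketch. The classes below are hypothetical illustrations of the idea, not the system's code: `TtlCache` and `BudgetTracker` are invented names, and the choice to reject a charge before it is recorded is an assumption about how the limit is enforced.

```typescript
// In-memory cache whose entries expire after a fixed time-to-live.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }
}

// Tracks spend and refuses any charge that would exceed the limit.
class BudgetTracker {
  private spentUsd = 0;
  private limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  charge(amountUsd: number): void {
    if (this.spentUsd + amountUsd > this.limitUsd) {
      throw new Error(`Budget of $${this.limitUsd.toFixed(2)} would be exceeded`);
    }
    this.spentUsd += amountUsd;
  }

  get remainingUsd(): number {
    return this.limitUsd - this.spentUsd;
  }
}

const THIRTY_MINUTES_MS = 30 * 60 * 1000;
const xResponseCache = new TtlCache<string>(THIRTY_MINUTES_MS);
const researchBudget = new BudgetTracker(1.5);
```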

Quality Assurance

Multi-Layer Quality Controls

Error Handling

| Component | Strategy |
| --- | --- |
| Research Agent | Credit balance retry (abort after 3 failures) |
| Writer Agent | 3 retries with exponential backoff |
| Enrichment Agent | Graceful degradation (return original) |
| Publisher | Continue with remaining articles on failure |
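
The writer and enrichment strategies follow two common patterns, sketched below under assumptions. `withRetry`, `backoffDelayMs`, and `enrichOrFallback` are hypothetical names; the 1-second base delay and the doubling factor are illustrative, since the source gives only the retry count; and "3 retries" is read as one initial attempt plus up to three more.

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * 2 ** attempt;
}

// Calls fn, retrying up to `retries` times with exponential backoff between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries: number = 3,
  baseMs: number = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only sleep if another attempt will follow.
      if (attempt < retries) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}

// Graceful degradation: if enrichment fails, return the original article unchanged.
async function enrichOrFallback<T>(original: T, enrich: (article: T) => Promise<T>): Promise<T> {
  try {
    return await enrich(original);
  } catch {
    return original;
  }
}
```

Injecting `sleep` keeps the retry logic testable without real delays.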

Summary

The SimpleNews Agent System demonstrates a production-grade approach to AI-powered content generation:

  • Autonomous Discovery - Claude Agent SDK enables intelligent tool selection
  • Multi-Platform Coverage - 7+ discovery sources with diversity enforcement
  • Quality Assurance - Structured outputs, validation layers, semantic dedup
  • Cost Efficiency - Caching, free tool priority, budget caps
  • Full Transparency - Cost tracking, audit trails, source attribution

The system produces 5-10 original news articles per run, each with proper citations, enriched links, and the E-E-A-T metadata described above to support SEO.
