Alton Wells — AI Content Engine: Architecture & Scope
Project Codename: ContentEngine
Framework: Mastra (TypeScript)
Status: Architecture Review Draft
Date: March 2026
1. System Overview
This document defines the architecture, data model, agent design, workflow orchestration, and scope for an autonomous AI-powered content production system. The system operates across three layers — Strategy, Content, and Production — mirroring the flow diagram, with a fourth Data Infrastructure layer underpinning everything.
The core loop is:
```
Competitors + Market Data + Our Performance + Our Strategy
        ↓
AI generates a Content Plan (calendar)
        ↓
Human reviews/adjusts the plan
        ↓
AI autonomously executes each piece through the pipeline
        ↓
Human reviews final output
        ↓
Programmatic SEO validation (must pass 10/10)
        ↓
Publish
```

The system is designed so that humans set strategy and approve output, while AI handles everything in between — research, planning, writing, editing, optimization, image generation, and SEO validation.
2. Data Infrastructure Layer
This is the foundation. Every agent in the system reads from and writes to these data stores. Without clean, continuously updated data, the agents cannot make good decisions.
2.1 Competitor Intelligence Database
Purpose: A living, continuously updated database of every competitor — their domains, sitemaps, page inventory, content topics, publishing frequency, and content changes over time.
Schema (core tables):
| Table | Key Fields | Update Frequency |
|---|---|---|
| competitors | id, name, domain, industry_vertical, notes | Manual + agent-suggested |
| competitor_sitemaps | id, competitor_id, sitemap_url, last_crawled_at, page_count | Daily crawl |
| competitor_pages | id, competitor_id, url, title, meta_description, h1, h2s[], word_count, published_at, last_modified, content_hash | Daily crawl |
| competitor_content_analysis | id, page_id, topics[], keywords[], content_type, estimated_traffic, serp_position, quality_score | Weekly analysis |
| competitor_changes | id, page_id, change_type (new/updated/removed), detected_at, diff_summary | Daily diff |
Data Collection Agents/Jobs:
| Job | Schedule | What It Does |
|---|---|---|
| Sitemap Crawler | Daily, 2 AM | Fetches and parses all competitor sitemaps. Detects new pages, removed pages, and last-modified changes. Stores raw sitemap XML and parsed entries. |
| Page Scraper | Daily, 3 AM | For new/changed pages, fetches the page, extracts title, meta, headings, word count, content body. Computes content hash for change detection. |
| Content Analyzer | Weekly, Sunday | Runs an LLM analysis pass over new/changed competitor pages. Extracts topics, identifies content type (blog, guide, landing page, comparison), estimates quality. |
| Competitor Diff Reporter | Daily, 6 AM | Generates a digest of all competitor content changes in the last 24 hours. Posts to Slack/dashboard. Flags high-priority changes (new pages targeting our keywords). |
Key Design Decision: Store the full text content of competitor pages (not just metadata). This feeds the Strategy Agent's ability to analyze positioning, messaging gaps, and content depth. Store as plain text extracted from HTML, not raw HTML.
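The change detection used by the Sitemap Crawler and Page Scraper can be sketched in a few lines; this is a minimal illustration (SHA-256 over normalized plain text, plus a set diff over sitemap URLs), not the full job implementation:

```typescript
import { createHash } from "node:crypto";

// Hash the extracted plain text so cosmetic HTML changes don't trigger diffs.
export function contentHash(extractedText: string): string {
  // Normalize whitespace before hashing to avoid spurious change events.
  const normalized = extractedText.replace(/\s+/g, " ").trim();
  return createHash("sha256").update(normalized).digest("hex");
}

// Classify sitemap changes by diffing the previous and current URL sets.
export function diffSitemap(previous: Set<string>, current: Set<string>) {
  return {
    added: [...current].filter((url) => !previous.has(url)),
    removed: [...previous].filter((url) => !current.has(url)),
  };
}
```

A page's stored `content_hash` is compared against the freshly computed hash to decide whether the Content Analyzer needs to re-process it.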
2.2 Search & Keyword Performance Database
Purpose: Centralized store of keyword research data, SERP landscape, search trends, and generative search (AI Overview) appearances.
Schema (core tables):
| Table | Key Fields | Update Frequency |
|---|---|---|
| keywords | id, keyword, search_volume, difficulty, cpc, intent (informational/transactional/etc), cluster_id | Weekly refresh |
| keyword_clusters | id, name, primary_keyword_id, topic, priority | Manual + AI-suggested |
| serp_snapshots | id, keyword_id, snapshot_date, organic_results[], featured_snippet, people_also_ask[], ai_overview_present, ai_overview_sources[] | Weekly |
| ai_overview_tracking | id, keyword_id, detected_at, our_site_cited (bool), cited_sources[], summary_text | Daily for priority keywords |
| search_trends | id, keyword_id, date, volume_index, yoy_change | Monthly |
Data Sources & Integrations:
| Source | What It Provides | Integration Method |
|---|---|---|
| Semrush / Ahrefs API | Search volume, difficulty, CPC, SERP features, competitor rankings | REST API via Mastra tool |
| Google Search Console API | Our impressions, clicks, CTR, average position per query | OAuth2 API via Mastra tool |
| Google Trends API (unofficial) | Relative search interest over time | Scraping or unofficial API |
| SERP Scraper | Full SERP snapshots including AI Overview detection | Custom tool (SerpAPI or Browserbase) |
| Generative search monitors | AI Overview presence and citation tracking | Custom OpenClaw cron or Mastra scheduled workflow |
2.3 Our Content & Performance Database
Purpose: Complete inventory of all our published content, its structure, performance metrics, and metadata. This is what the agents use to understand "what we have" and "how it's performing."
Schema (core tables):
| Table | Key Fields | Update Frequency |
|---|---|---|
| our_pages | id, url, title, slug, content_type, status (published/draft/planned), published_at, last_updated, word_count, content_body | On publish/update |
| our_page_seo | id, page_id, meta_title, meta_description, h1, h2s[], canonical_url, schema_markup, internal_links_out[], internal_links_in[], seo_score | On publish + weekly audit |
| our_page_performance | id, page_id, date, impressions, clicks, ctr, avg_position, sessions, bounce_rate, avg_time_on_page | Daily from GSC + Analytics |
| our_page_keywords | id, page_id, keyword_id, target (bool), current_position, position_change_7d, position_change_30d | Daily |
| content_embeddings | id, page_id, chunk_index, embedding_vector, chunk_text | On publish/update |
Key Design Decision: Maintain a vector embedding index of all our content (using Mastra's RAG pipeline). This powers internal linking automation, content gap detection, and duplicate/cannibalization detection.
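Duplicate and cannibalization detection over that index reduces to a similarity threshold on embedding vectors. A minimal sketch (cosine similarity over precomputed vectors; the 0.9 threshold is an assumption to tune, and in production pgvector would do this query in SQL):

```typescript
// Cosine similarity between two embedding vectors of equal length.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Flag existing pages whose embedding is too close to a proposed topic's embedding.
export function findCannibalizationRisks(
  proposed: number[],
  pages: { pageId: string; embedding: number[] }[],
  threshold = 0.9, // assumption: tune against real duplicate/cannibalization cases
): string[] {
  return pages
    .filter((p) => cosineSimilarity(proposed, p.embedding) >= threshold)
    .map((p) => p.pageId);
}
```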
2.4 Content Strategy & Planning Database
Purpose: The content calendar, strategy directives, brand voice guidelines, and editorial configuration that humans set and the AI operates against.
Schema (core tables):
| Table | Key Fields | Update Frequency |
|---|---|---|
| content_strategy | id, name, description, target_audience, brand_voice_guidelines, content_pillars[], priorities, active (bool) | Manual (human-set) |
| content_plan_items | id, strategy_id, title, target_keyword_id, content_type, status (planned/in_progress/review/published), scheduled_date, assigned_agent_run_id, priority, notes, source (ai_generated/human_added) | AI-generated + human-edited |
| content_briefs | id, plan_item_id, outline, target_word_count, target_keywords[], competitor_references[], internal_link_targets[], research_notes, approved (bool) | AI-generated, human-approved |
| content_drafts | id, plan_item_id, version, content_body, seo_score, editor_notes, status (draft/edited/approved/rejected) | AI-generated per pipeline run |
| brand_voice_samples | id, strategy_id, sample_text, tone_tags[], notes | Manual upload |
Key Design Decision: The content_plan_items table has a source field — either ai_generated or human_added. Both the AI planning agent and human users can add items to the calendar. The AI respects human-added items as fixed constraints when generating its plans.
3. Strategy Layer — Agents & Workflows
This layer corresponds to the top section of your flow diagram. The agents here consume all the data infrastructure, synthesize it, and produce actionable content plans.
3.1 Competitive Intelligence Agent
```
ID: competitive-intelligence-agent
Model: anthropic/claude-sonnet-4-20250514
```

Role: Continuously analyzes the competitor database and produces strategic insights about competitor positioning, content gaps, and opportunities.
Inputs:
- Competitor pages database (new and changed content)
- Our content database (for comparison)
- Keyword performance data
Tools:
- `query_competitor_pages` — Searches competitor content by topic, keyword, or content type
- `query_our_pages` — Searches our content for comparison
- `competitor_diff_summary` — Gets recent competitor content changes
- `web_search` — Validates findings against live web data
Outputs (structured, Zod-validated):
```
{
  competitorMoves: [{
    competitor: string,
    action: "new_content" | "content_update" | "new_topic",
    details: string,
    relevanceToUs: "high" | "medium" | "low",
    suggestedResponse: string,
  }],
  contentGaps: [{
    topic: string,
    competitorsCovering: string[],
    ourCoverage: "none" | "weak" | "adequate",
    opportunity: string,
    estimatedImpact: "high" | "medium" | "low",
  }],
  positioningInsights: string,
}
```

Schedule: Weekly (full analysis), daily (change digest only).
3.2 Search Landscape Agent
```
ID: search-landscape-agent
Model: anthropic/claude-sonnet-4-20250514
```

Role: Monitors keyword performance, SERP changes, AI Overview appearances, and search trends. Identifies where we're winning, where we're losing, and where new opportunities are emerging.
Inputs:
- Keyword + SERP snapshot database
- Our page performance data (GSC)
- AI Overview tracking data
- Search trend data
Tools:
- `query_keyword_performance` — Pulls our ranking data for specific keywords
- `query_serp_snapshots` — Gets the SERP landscape for keywords
- `query_ai_overview_tracking` — Checks AI Overview presence and our citation status
- `semrush_keyword_research` — Fetches fresh keyword data from the Semrush API
- `gsc_performance_query` — Pulls real-time data from Google Search Console
Outputs (structured):
```
{
  rankingChanges: [{ keyword, previousPosition, currentPosition, trend, url }],
  aiOverviewAlerts: [{ keyword, ourSiteCited, topCitedSources, recommendation }],
  emergingKeywords: [{ keyword, volume, difficulty, relevance, opportunity }],
  decliningContent: [{ url, keyword, positionDrop, suggestedAction }],
  searchTrendShifts: [{ topic, direction, magnitude, implication }],
}
```

Schedule: Daily for ranking changes, weekly for full landscape analysis.
3.3 Content Strategy Agent (The Planner)
```
ID: content-strategy-agent
Model: anthropic/claude-sonnet-4-20250514
```

Role: The brain of the Strategy Layer. Synthesizes outputs from the Competitive Intelligence Agent, Search Landscape Agent, our content inventory, our strategy directives, and brand guidelines to produce a prioritized content plan.
This is the agent that "thinks through all of these and comes up with a plan" from your diagram.
Inputs:
- Competitive Intelligence Agent output
- Search Landscape Agent output
- Current content strategy (human-set pillars, priorities, brand voice)
- Existing content plan items (both AI-generated and human-added)
- Our content inventory + performance data
- Content embeddings (to avoid duplication)
Tools:
- `query_content_strategy` — Gets current strategy directives
- `query_content_plan` — Gets existing planned/scheduled items
- `query_our_content_inventory` — Searches our published content
- `vector_similarity_search` — Checks if proposed topics overlap with existing content
- `web_search` — Researches topics for viability and resource discovery
- `add_content_plan_item` — Writes new items to the content calendar
- `update_content_plan_item` — Modifies existing plan items
Workflow:

1. Load current strategy directives and priorities
2. Ingest the Competitive Intelligence report
3. Ingest the Search Landscape report
4. Review the existing content plan (what's already scheduled)
5. Review our content inventory (what we've already published)
6. Identify gaps between strategy goals and current coverage
7. Generate candidate content ideas with rationale
8. Score and prioritize candidates by:
   - Strategic alignment (does it serve our pillars?)
   - Search opportunity (volume × 1/difficulty)
   - Competitive urgency (are competitors gaining ground?)
   - Content gap severity (do we have zero coverage?)
   - GEO potential (can this be cited by AI search?)
9. Check for duplication/cannibalization via vector similarity
10. Research top candidates (web search for viability, resource links)
11. Produce a prioritized content plan with scheduling recommendations
12. Write plan items to the database → SUSPEND for human review

Outputs:
```
{
  planItems: [{
    title: string,
    targetKeyword: string,
    contentType: "blog_post" | "guide" | "landing_page" | "comparison" | "case_study",
    rationale: string, // Why this piece, why now
    competitiveContext: string, // What competitors are doing
    suggestedScheduleDate: Date,
    priority: 1 | 2 | 3,
    estimatedImpact: string,
    researchLinks: string[], // Pre-found resources to link/reference
    internalLinkTargets: string[], // Our existing pages to link to/from
  }],
  strategyNotes: string, // Agent's high-level strategic thinking
  calendarSummary: string, // Overview of the proposed schedule
}
```

Human Interaction Point: After the agent generates the plan, the workflow suspends (Mastra `.waitForEvent()`). The human reviews the proposed calendar in the app UI and can approve, reject, or edit items, add their own items, and reorder priorities. When they click "Approve Plan," the workflow resumes and the approved items are queued for execution.
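The scoring step in the planner's workflow can be made concrete as a weighted formula. This is a sketch: the weights, the normalization cap, and the volume × 1/difficulty opportunity term are illustrative assumptions to tune, not values fixed by this design:

```typescript
interface Candidate {
  strategicAlignment: number;  // 0-1: does it serve our pillars?
  searchVolume: number;        // monthly searches
  difficulty: number;          // 1-100 keyword difficulty
  competitiveUrgency: number;  // 0-1: are competitors gaining ground?
  gapSeverity: number;         // 0-1: 1 = zero coverage on our side
  geoPotential: number;        // 0-1: likelihood of AI-search citation
}

// Weighted priority score; higher means schedule sooner.
export function priorityScore(c: Candidate): number {
  const searchOpportunity = c.searchVolume / Math.max(c.difficulty, 1);
  return (
    0.30 * c.strategicAlignment +
    0.25 * Math.min(searchOpportunity / 100, 1) + // normalize and cap at 1
    0.20 * c.competitiveUrgency +
    0.15 * c.gapSeverity +
    0.10 * c.geoPotential
  );
}
```

Making the score deterministic (rather than asking the LLM to rank freehand) keeps prioritization auditable; the agent supplies the component estimates, the code computes the ordering.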
3.4 Content Brief Agent
```
ID: content-brief-agent
Model: anthropic/claude-sonnet-4-20250514
```

Role: For each approved content plan item, generates a detailed content brief that the Writer Agent will execute against.
Inputs:
- Approved content plan item (from calendar)
- Competitor content analysis (what top-ranking pages cover)
- Keyword data (primary, secondary, related terms)
- Our existing content (for internal linking opportunities)
- Brand voice guidelines
Tools:
- `query_serp_snapshots` — Gets the current SERP landscape for the target keyword
- `scrape_competitor_page` — Extracts structure and content from top-ranking competitor pages
- `query_our_content_inventory` — Finds internal linking opportunities
- `web_search` — Finds additional resources, data sources, expert quotes
- `vector_similarity_search` — Ensures the brief doesn't duplicate existing content
Outputs (structured):
```
{
  title: string,
  targetKeyword: string,
  secondaryKeywords: string[],
  searchIntent: string,
  targetWordCount: number,
  contentFormat: string, // "How-to guide", "Listicle", "Deep dive", etc.
  outline: [{
    heading: string,
    level: "h2" | "h3",
    keyPoints: string[],
    targetKeywords: string[], // Keywords to naturally include in this section
    suggestedWordCount: number,
  }],
  competitorAnalysis: string, // What the top 5 do well, where they fall short
  differentiators: string[], // Our unique angles
  internalLinkTargets: [{
    url: string,
    anchorTextSuggestion: string,
    contextNote: string,
  }],
  externalResources: [{
    url: string,
    description: string,
    useCase: string, // "Cite as source", "Link for reader", "Reference for accuracy"
  }],
  toneAndStyle: string,
  audienceNotes: string,
  seoRequirements: {
    metaTitleGuideline: string,
    metaDescriptionGuideline: string,
    schemaType: string,
    featuredSnippetTarget: boolean,
  },
}
```

4. Content Layer — Agents & Workflows
This layer corresponds to the middle section of your flow diagram. It takes an approved content brief and produces a polished draft.
4.1 Writer Agent
```
ID: writer-agent
Model: anthropic/claude-sonnet-4-20250514
Instructions: Dynamic — loads brand voice guidelines + brief-specific tone directives
```

Role: Receives a content brief and produces a complete first draft that follows the outline, hits the word-count targets, incorporates keywords naturally, includes internal and external links, and matches the brand voice.
Key Prompting Strategy:
The Writer Agent's system prompt is assembled dynamically for each piece:
```
Base Instructions (static):
  - Writing quality standards
  - Formatting rules (heading hierarchy, paragraph length, etc.)
  - Link insertion patterns
  - Keyword integration rules (natural, not stuffed)

+ Brand Voice Guidelines (from content_strategy table):
  - Tone descriptors
  - Sample passages
  - Vocabulary preferences / words to avoid

+ Brief-Specific Directives (from content_briefs table):
  - The full outline with section-level instructions
  - Target keywords per section
  - Differentiators to emphasize
  - Internal/external links to include
```

Tools:
- `web_search` — For real-time fact verification during writing
- `query_our_content` — To check consistency with existing published content
- `vector_similarity_search` — To find additional internal linking opportunities during writing
Output: Full markdown content body with frontmatter metadata, internal links, external links, and image placement markers ([IMAGE: description of needed image]).
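The dynamic prompt assembly described above amounts to ordered string concatenation over the three sources. A minimal sketch (the interface fields mirror the Section 2.4 tables; the section headers inside the prompt are illustrative):

```typescript
interface PromptParts {
  baseInstructions: string;      // static writing quality and formatting rules
  brandVoiceGuidelines: string;  // from the content_strategy table
  voiceSamples: string[];        // from the brand_voice_samples table
  briefDirectives: string;       // outline + per-section keywords from content_briefs
}

// Assemble the Writer Agent's system prompt for one piece of content.
export function assembleWriterPrompt(p: PromptParts): string {
  return [
    p.baseInstructions,
    "## Brand Voice Guidelines",
    p.brandVoiceGuidelines,
    ...p.voiceSamples.map((s, i) => `Sample ${i + 1}:\n${s}`),
    "## Brief-Specific Directives",
    p.briefDirectives,
  ].join("\n\n");
}
```

Keeping assembly in code (rather than asking the model to fetch its own instructions) makes each run reproducible from the database rows it was built from.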
4.2 Editor Agent
```
ID: editor-agent
Model: anthropic/claude-sonnet-4-20250514
Instructions: Focused on language correctness, verbal consistency, and brand voice adherence
```

Role: Reviews the Writer Agent's draft for language correctness, verbal consistency, factual accuracy, brand voice adherence, and structural quality. Does NOT rewrite — provides specific edits and corrections.
Checks performed:
| Check Category | What It Evaluates |
|---|---|
| Language Correctness | Grammar, spelling, punctuation, sentence structure |
| Verbal Consistency | Consistent terminology (don't switch between "users" and "customers" randomly), consistent formatting of terms, consistent voice (active vs. passive) |
| Brand Voice Adherence | Compares against brand voice samples. Flags sections that drift from established tone |
| Factual Consistency | Cross-references claims with the brief's source material. Flags unsubstantiated statistics |
| Structural Quality | Heading hierarchy compliance, section length balance, transition quality, intro/conclusion effectiveness |
| Link Quality | Verifies all internal links point to real pages, checks anchor text naturalness, validates external link relevance |
| Keyword Integration | Confirms target keywords appear in required positions (title, H1, first paragraph, H2s) without over-optimization |
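The keyword-integration check lends itself to a deterministic helper the Editor Agent can call rather than estimate. A sketch, using the 0.5-2.5% density window from the SEO validation rules in Section 5.4 (the density formula here is one common convention, an assumption rather than a standard):

```typescript
// Keyword density as a percentage of total words (phrase matches count each word).
export function keywordDensity(body: string, keyword: string): number {
  const words = body.toLowerCase().split(/\s+/).filter(Boolean);
  const kw = keyword.toLowerCase().split(/\s+/);
  let hits = 0;
  for (let i = 0; i + kw.length <= words.length; i++) {
    if (kw.every((w, j) => words[i + j] === w)) hits++;
  }
  return words.length === 0 ? 0 : (hits * kw.length / words.length) * 100;
}

// Pass/fail against the over-optimization window.
export function keywordIntegrationOk(body: string, keyword: string): boolean {
  const density = keywordDensity(body, keyword);
  return density >= 0.5 && density <= 2.5;
}
```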
Output:
```
{
  overallAssessment: "pass" | "needs_revision",
  editsList: [{
    location: string, // Section/paragraph identifier
    type: "grammar" | "voice" | "factual" | "structural" | "keyword" | "link",
    severity: "critical" | "suggested",
    original: string,
    suggested: string,
    rationale: string,
  }],
  voiceConsistencyScore: number, // 0-100
  readabilityScore: number, // Flesch-Kincaid or similar
  revisedContent: string, // Full draft with all edits applied
}
```

Workflow Logic: If overallAssessment === "needs_revision" and critical edits exist, the revised content loops back through the Writer Agent with the edit list as context. Maximum 2 revision cycles, then force-proceed to human review.
4.3 Content Layer Workflow (Mastra)
```typescript
const contentLayerWorkflow = createWorkflow({
  id: "content-layer-pipeline",
  inputSchema: z.object({ briefId: z.string() }),
  outputSchema: z.object({ draftId: z.string(), seoScore: z.number() }),
})
  .then(loadBriefStep)      // Load the approved content brief
  .then(writerAgentStep)    // Writer produces first draft
  .then(editorAgentStep)    // Editor reviews and revises
  .branch({                 // Conditional: needs another pass?
    condition: ({ editorOutput }) =>
      editorOutput.overallAssessment === "needs_revision"
      && editorOutput.revisionCount < 2,
    trueStep: writerRevisionStep,  // Loop back to writer with edits
    falseStep: finalizeDraftStep,  // Proceed to final draft
  })
  .then(saveDraftStep)      // Persist final draft to database
  .commit();
```

5. Production Layer — Agents, Checks & Publishing
This layer corresponds to the bottom section of your flow diagram. It takes the polished draft through human review, image generation, final edits, programmatic SEO validation, and publishing.
5.1 Human Review (Suspend/Resume)
This is the critical quality gate.
When the Content Layer produces a final draft:
- Draft is saved to `content_drafts` with status `awaiting_review`
- Notification sent (Slack, email, or in-app)
- Workflow suspends via `.waitForEvent("human-review-complete")`
- Human opens the draft in the app UI
- Human can:
  - Approve — proceed as-is
  - Edit — make changes directly, then approve
  - Reject — send back to the Content Layer with notes (triggers a new Writer Agent run with human feedback as context)
  - Add personal experience/insights — the critical 20% from the 80/20 method
- On approval, the workflow resumes with the human-edited content
UI Requirements:
- Side-by-side view: AI draft vs. content brief
- Inline editing with change tracking
- Comment/annotation capability
- SEO score preview (live-updating as human edits)
- One-click approve/reject buttons
5.2 Image Generation Agent
```
ID: image-generation-agent
Model: Image generation API (DALL-E 3, Midjourney API, or Flux)
```

Role: Processes the [IMAGE: description] markers in the content and generates appropriate images.
Workflow:
- Parse all image markers from the approved content
- For each marker, generate a detailed image prompt based on the description and content context
- Generate the image via API
- Optimize the image (compress, resize to target dimensions)
- Generate SEO-optimized alt text
- Upload to CDN / media library
- Replace markers in content with proper `<img>` tags including alt text, width/height, and `loading="lazy"`
Output: Updated content body with all image markers replaced by actual image references + alt text.
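The marker parsing and replacement steps can be sketched with a single regex over the draft. The `[IMAGE: description]` syntax comes from the Writer Agent's output format; the asset map standing in for the generated-image lookup is a simplification of the CDN upload step:

```typescript
const IMAGE_MARKER = /\[IMAGE:\s*([^\]]+)\]/g;

// Extract every image marker description from a draft.
export function extractImageMarkers(content: string): string[] {
  return [...content.matchAll(IMAGE_MARKER)].map((m) => m[1].trim());
}

// Replace each marker with an <img> tag, given generated assets keyed by description.
export function replaceImageMarkers(
  content: string,
  assets: Map<string, { url: string; alt: string; width: number; height: number }>,
): string {
  return content.replace(IMAGE_MARKER, (match, desc: string) => {
    const asset = assets.get(desc.trim());
    if (!asset) return match; // leave unresolved markers for the Final Edits Agent to flag
    return `<img src="${asset.url}" alt="${asset.alt}" width="${asset.width}" height="${asset.height}" loading="lazy" />`;
  });
}
```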
5.3 Final Edits Agent
```
ID: final-edits-agent
Model: anthropic/claude-sonnet-4-20250514
```

Role: One last pass after human edits and image insertion. Ensures human edits didn't break formatting, images are properly placed, links still work, and the content is publication-ready.
Checks:
- Markdown/HTML formatting validity
- Image placement and alt text quality
- Link integrity (no broken internal links)
- Consistent formatting after human edits
- Final readability pass
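The link-integrity check above can be done deterministically against the `our_pages` inventory; a sketch assuming markdown-style links and a published-path set loaded from the database (the `siteOrigin` default is a placeholder, not a real domain):

```typescript
const MD_LINK = /\[[^\]]*\]\(([^)]+)\)/g;

// Return internal link targets in the draft that don't resolve to a published page.
export function findBrokenInternalLinks(
  content: string,
  publishedPaths: Set<string>,
  siteOrigin = "https://example.com", // assumption: configured per site
): string[] {
  const broken: string[] = [];
  for (const match of content.matchAll(MD_LINK)) {
    const href = match[1];
    const isInternal = href.startsWith("/") || href.startsWith(siteOrigin);
    if (!isInternal) continue; // external links are checked separately
    const path = href.startsWith("/") ? href : new URL(href).pathname;
    if (!publishedPaths.has(path)) broken.push(href);
  }
  return broken;
}
```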
5.4 Programmatic SEO Validation (Must Score 10/10)
This is not an LLM agent — it's a deterministic, code-based validation engine. Every check has a binary pass/fail result. All 10 must pass.
```typescript
const seoChecks = {
  1: {
    name: "Meta Title",
    check: (content) => {
      // Length: 50-60 characters
      // Contains primary keyword
      // Unique (not used by any other page)
      // No truncation risk
    }
  },
  2: {
    name: "Meta Description",
    check: (content) => {
      // Length: 150-160 characters
      // Contains primary keyword
      // Includes call-to-action or value proposition
      // Unique across site
    }
  },
  3: {
    name: "Heading Hierarchy",
    check: (content) => {
      // Exactly one H1
      // H1 contains primary keyword
      // H2s use secondary keywords
      // No skipped levels (H1 → H3 without H2)
      // Logical nesting
    }
  },
  4: {
    name: "Keyword Optimization",
    check: (content) => {
      // Primary keyword in: title, H1, first 100 words, at least one H2, meta description
      // Keyword density: 0.5% - 2.5% (not over-optimized)
      // Secondary keywords present naturally
      // No keyword stuffing patterns detected
    }
  },
  5: {
    name: "Internal Linking",
    check: (content) => {
      // Minimum 3 internal links
      // All internal links point to valid, published pages
      // Anchor text is descriptive (no "click here")
      // Anchor text is diversified (not all exact-match keyword)
      // Links are contextually relevant
    }
  },
  6: {
    name: "External Linking",
    check: (content) => {
      // At least 1 external link to authoritative source
      // External links use rel="noopener" on new-tab links
      // No links to competitor domains (configurable blocklist)
      // External links are contextually relevant
    }
  },
  7: {
    name: "Content Quality Metrics",
    check: (content) => {
      // Word count meets target (within 10% of brief target)
      // Readability score within acceptable range (configurable)
      // No duplicate content detected (cosine similarity < threshold vs. existing pages)
      // Paragraph length: no paragraphs over 300 words
      // Sentence variety: mix of short and long sentences
    }
  },
  8: {
    name: "Technical SEO",
    check: (content) => {
      // Valid schema markup (JSON-LD) present and parseable
      // Canonical URL set correctly
      // Open Graph tags present (og:title, og:description, og:image)
      // Twitter Card tags present
      // Image alt text on all images
      // Image dimensions specified
    }
  },
  9: {
    name: "URL & Slug",
    check: (content) => {
      // Slug is URL-friendly (lowercase, hyphens, no special chars)
      // Slug contains primary keyword or close variant
      // Slug length: under 60 characters
      // No duplicate slug in our pages database
    }
  },
  10: {
    name: "Mobile & Performance",
    check: (content) => {
      // All images have width/height (prevents CLS)
      // Images use lazy loading
      // No inline styles that break mobile
      // Table responsiveness handled
      // No excessively large embedded content
    }
  }
};
```

Validation Flow:
- Run all 10 checks against the content
- If all pass → proceed to publish
- If any fail → generate a fix report, send back to Final Edits Agent with specific failure details, re-validate after fixes
- Maximum 3 fix cycles, then escalate to human with the failure report
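As a concrete instance, check 1 (Meta Title) plus the score aggregation might look like this. It is a sketch: the uniqueness test against other pages is stubbed as a caller-supplied set rather than a database query:

```typescript
interface CheckResult {
  id: number;
  name: string;
  passed: boolean;
  failureReason?: string;
}

// Check 1: Meta Title — deterministic, binary pass/fail.
export function checkMetaTitle(
  metaTitle: string,
  primaryKeyword: string,
  existingTitles: Set<string>, // meta titles already used on other pages
): CheckResult {
  const reasons: string[] = [];
  if (metaTitle.length < 50 || metaTitle.length > 60) reasons.push("length outside 50-60 chars");
  if (!metaTitle.toLowerCase().includes(primaryKeyword.toLowerCase())) reasons.push("missing primary keyword");
  if (existingTitles.has(metaTitle)) reasons.push("duplicate of another page's title");
  return { id: 1, name: "Meta Title", passed: reasons.length === 0, failureReason: reasons.join("; ") || undefined };
}

// Aggregate results into the "N/10" score; publishing requires every check to pass.
export function scoreChecks(results: CheckResult[]): { score: string; passed: boolean } {
  const passing = results.filter((r) => r.passed).length;
  return { score: `${passing}/${results.length}`, passed: passing === results.length };
}
```

Because each check returns a `failureReason`, the fix report sent back to the Final Edits Agent falls out of the results for free.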
Output:
```
{
  score: "10/10" | "9/10" | etc,
  passed: boolean,
  checks: [{
    id: number,
    name: string,
    passed: boolean,
    details: string, // What was checked
    failureReason?: string, // Why it failed (if applicable)
    autoFixable: boolean, // Can the agent fix this automatically?
  }],
}
```

5.5 Publishing Agent
```
ID: publishing-agent
Model: N/A (primarily tool-driven, minimal LLM reasoning)
```

Role: Takes the validated, 10/10 content and publishes it to the CMS.
Actions:
- Format content for CMS (markdown → CMS content blocks, or HTML)
- Upload images to CMS media library (if not already CDN-hosted)
- Set all metadata (title, description, slug, canonical, schema, OG tags)
- Set publication date (from content plan schedule)
- Create/update XML sitemap entry
- Ping Google Indexing API / IndexNow for fast crawling
- Update our content database (`our_pages`, `our_page_seo`)
- Generate content embeddings and add them to the vector index
- Run bidirectional internal linking (find places in existing content to link to the new page)
- Schedule social media distribution (if configured)
- Log publish event to audit trail
Post-Publish Monitoring:
- 24-hour check: Verify page is indexed (Google Search Console)
- 7-day check: Initial ranking data and impressions
- 30-day check: Performance review against projected targets
- Auto-flag underperforming content for refresh consideration
6. Master Workflow Orchestration
The entire system is orchestrated as a Mastra workflow that chains the three layers:
```
┌─────────────────────────────────────────────────────────────────┐
│                      SCHEDULED TRIGGERS                         │
│                                                                 │
│  Daily: Competitor crawl, SERP monitoring, rank tracking        │
│  Weekly: Full competitive analysis, search landscape report     │
│  On-demand: Human triggers plan generation                      │
│  On-schedule: Content calendar items trigger execution          │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                   STRATEGY LAYER WORKFLOW                       │
│                                                                 │
│  1. Competitive Intelligence Agent (async, data collection)     │
│  2. Search Landscape Agent (async, data collection)             │
│  3. Content Strategy Agent (synthesis + plan generation)        │
│  4. ── SUSPEND ── Human reviews/edits content plan ── RESUME ── │
│  5. Content Brief Agent (generates brief per approved item)     │
│  6. ── SUSPEND ── Human approves brief ── RESUME ──             │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    CONTENT LAYER WORKFLOW                       │
│                                                                 │
│  7. Writer Agent (produces first draft from brief)              │
│  8. Editor Agent (reviews, revises)                             │
│  9. ── LOOP ── if needs_revision && cycles < 2 → back to 7 ──   │
│  10. Final draft saved to database                              │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                  PRODUCTION LAYER WORKFLOW                      │
│                                                                 │
│  11. ── SUSPEND ── Human reviews draft (edits, adds             │
│        experience, approves) ── RESUME ──                       │
│  12. Image Generation Agent                                     │
│  13. Final Edits Agent (post-human cleanup)                     │
│  14. Programmatic SEO Validation (10/10 required)               │
│  15. ── LOOP ── if score < 10/10 → auto-fix → revalidate ──     │
│  16. Publishing Agent (push to CMS, index, distribute)          │
│  17. Post-publish monitoring scheduled                          │
└─────────────────────────────────────────────────────────────────┘
```

Suspension Points (Human Touchpoints)
| # | Where | What Human Does | Estimated Time |
|---|---|---|---|
| 1 | After content plan generation | Review calendar, approve/reject/edit items, add own items | 15-30 min per planning cycle |
| 2 | After content brief generation | Review brief, confirm outline and direction | 5-10 min per brief |
| 3 | After final draft produced | Deep review, add personal insights, edit for voice, approve | 10-20 min per piece |
Total human time per content piece: ~30-60 minutes (compared to 4-8 hours for fully manual content creation).
7. Application UI Requirements
The system needs a web application (likely Next.js given Mastra's TypeScript ecosystem) with these core views:
7.1 Dashboard
- Content pipeline status (items in each stage)
- Today's scheduled publications
- Competitor change alerts (last 24h)
- Ranking changes (significant movers)
- AI Overview citation tracking
- Content performance summary
7.2 Competitor Monitor
- Competitor list with domain, page count, last crawled
- New/changed page feed (chronological)
- Per-competitor content analysis (topics, volume, quality)
- Side-by-side content comparison (their page vs. ours on same topic)
7.3 Keyword & Search Performance
- Keyword tracker with rankings over time
- SERP feature tracking (featured snippets, AI Overviews)
- Search volume trends
- Keyword cluster view
- GSC data integration (impressions, clicks, CTR)
7.4 Content Calendar
- Calendar view of planned/scheduled content
- Drag-and-drop rescheduling
- Status indicators (planned → in_progress → review → published)
- AI-generated items vs. human-added items (visually distinguished)
- One-click to view brief, draft, or published piece
- "Generate Plan" button that triggers the Strategy Agent
7.5 Content Editor / Review Interface
- Full content preview
- Side-by-side: brief vs. draft
- Inline editing with change tracking
- SEO score panel (live-updating)
- Comment/annotation system
- Approve / Request Changes / Reject buttons
- Image preview and alt text editing
7.6 SEO Audit View
- 10-point SEO check results for each piece
- Historical SEO scores across all content
- Site-wide SEO health metrics
- Internal linking map visualization
- Technical SEO issue tracker
7.7 Strategy Settings
- Brand voice configuration (samples, tone descriptors)
- Content pillars and priorities
- Competitor list management
- Target keyword management
- Publishing workflow configuration (which checks are required, auto-publish vs. manual)
8. Technology Stack
| Component | Technology | Rationale |
|---|---|---|
| Agent Framework | Mastra (@mastra/core) | TypeScript-native, workflow orchestration, RAG, tools, suspend/resume |
| Runtime | Node.js 20+ / Bun | Mastra's supported runtimes |
| Web Framework | Next.js 15+ | Mastra ecosystem alignment, SSR, API routes |
| Database | PostgreSQL + pgvector | Relational data + vector embeddings in one database |
| ORM | Drizzle ORM | TypeScript-native, great Postgres support |
| Vector Search | pgvector (via Mastra RAG) | Unified with primary database, no separate vector DB needed |
| LLM Provider | Anthropic Claude Sonnet 4 (primary) | Quality + cost balance for content generation |
| Image Generation | DALL-E 3 or Flux API | High quality, API-accessible |
| SEO Data API | Semrush or Ahrefs API | Keyword data, SERP snapshots, competitor data |
| Search Console | Google Search Console API | Our ranking/performance data |
| CMS Integration | WordPress REST API, Sanity, or Contentful | Depends on existing CMS — tool adapter pattern |
| Job Scheduling | Trigger.dev or Mastra cron | Durable execution for scheduled jobs |
| Notifications | Slack API + email | Human review notifications |
| Hosting | Vercel (app) + Railway/Render (workers) | Serverless for app, persistent processes for agents |
| Monitoring | Mastra Studio + custom observability | Agent tracing, workflow debugging |
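The pgvector choice means similarity search runs inside Postgres (its `<=>` operator is cosine distance), so no separate vector store is needed. As a conceptual sketch only, the ranking it performs looks like this in plain TypeScript; in the real system this is a `SELECT ... ORDER BY embedding <=> query LIMIT k`, not application code:

```typescript
// Cosine distance, as pgvector's <=> operator computes it.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored embeddings against a query embedding and return the
// k nearest row ids, mirroring the SQL nearest-neighbour query.
function nearest(
  query: number[],
  rows: { id: number; embedding: number[] }[],
  k: number,
): number[] {
  return [...rows]
    .sort((x, y) => cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding))
    .slice(0, k)
    .map((r) => r.id);
}
```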
9. Build Phases
Phase 1: Foundation (Weeks 1-3)
- PostgreSQL schema setup (all tables defined in Section 2)
- Mastra project scaffolding with agent/tool/workflow structure
- Basic Next.js app shell with authentication
- Google Search Console API integration (tool)
- Semrush/Ahrefs API integration (tool)
- Competitor sitemap crawler (scheduled job)
- Competitor page scraper (scheduled job)
- Our content inventory ingestion + embedding pipeline
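The page scraper in Phase 1 should only queue pages for re-analysis when their content actually changed, which is what the `content_hash` field in `competitor_pages` (Section 2.1) supports. A minimal sketch, assuming SHA-256 over whitespace-normalized content (the normalization rules are an assumption, not a spec decision):

```typescript
import { createHash } from "node:crypto";

// Normalize before hashing so trivial whitespace or casing changes
// don't register as content updates.
function contentHash(html: string): string {
  const normalized = html.replace(/\s+/g, " ").trim().toLowerCase();
  return createHash("sha256").update(normalized).digest("hex");
}

// Compare a freshly scraped page against the stored hash; only pages
// that changed (or are new, storedHash === null) get queued for the
// weekly content analysis pass.
function hasChanged(storedHash: string | null, freshHtml: string): boolean {
  return storedHash !== contentHash(freshHtml);
}
```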
Phase 2: Strategy Layer (Weeks 4-6)
- Competitive Intelligence Agent
- Search Landscape Agent
- Content Strategy Agent (plan generation)
- Content Brief Agent
- Content Calendar UI (view, edit, approve)
- Suspend/resume workflow for plan approval
- Dashboard v1 (basic metrics)
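The plan-approval step in Phase 2 hinges on Mastra's suspend/resume: the workflow pauses at the human gate and resumes when the calendar UI submits a decision. As a framework-free stand-in (Mastra's own `suspend()`/`resume()` primitives would replace this in practice), the gate reduces to a small state machine:

```typescript
type PlanStatus = "draft" | "awaiting_review" | "approved" | "rejected";

interface PlanState {
  status: PlanStatus;
  feedback?: string;
}

// "Suspend": persist the plan as awaiting_review and stop executing.
function submitForReview(state: PlanState): PlanState {
  if (state.status !== "draft") throw new Error("only drafts can be submitted");
  return { status: "awaiting_review" };
}

// "Resume": a human decision arrives from the Content Calendar UI.
// Rejections carry feedback back to the Content Strategy Agent.
function resolveReview(state: PlanState, approve: boolean, feedback?: string): PlanState {
  if (state.status !== "awaiting_review") throw new Error("plan is not awaiting review");
  return approve ? { status: "approved" } : { status: "rejected", feedback };
}
```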
Phase 3: Content Layer (Weeks 7-9)
- Writer Agent with dynamic prompt assembly
- Editor Agent with revision loop
- Content Layer workflow with branching
- Content Editor / Review UI
- Brand voice configuration UI
- Suspend/resume workflow for human review
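The Writer/Editor revision loop in Phase 3 needs a cap so a piece that never satisfies the Editor escalates to human review instead of looping forever. A sketch of that control flow (both callbacks would be async agent invocations in the real pipeline; they are synchronous here to keep the example self-contained):

```typescript
interface Review {
  approved: boolean;
  notes: string; // Editor feedback fed back into the Writer prompt
}

// Bounded revision loop: the Editor reviews each draft and the Writer
// revises with the Editor's notes until approval or the cap is hit,
// after which the piece escalates to human review.
function reviseUntilApproved(
  write: (notes?: string) => string,
  review: (draft: string) => Review,
  maxRevisions = 3,
): { draft: string; escalated: boolean } {
  let draft = write();
  for (let i = 0; i < maxRevisions; i++) {
    const verdict = review(draft);
    if (verdict.approved) return { draft, escalated: false };
    draft = write(verdict.notes);
  }
  return { draft, escalated: true };
}
```

The default cap of 3 is an assumption; the right value is a cost/quality trade-off to tune in Phase 6.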
Phase 4: Production Layer (Weeks 10-12)
- Image Generation Agent
- Final Edits Agent
- Programmatic SEO Validation engine (10 checks)
- Publishing Agent (CMS integration)
- Post-publish monitoring jobs
- SEO Audit View UI
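The Programmatic SEO Validation engine is a hard gate: a piece publishes only when every check passes (10/10 in the full system). The structure is a list of named predicates evaluated against the piece; the five checks and thresholds below are illustrative assumptions, not the finalized rule set:

```typescript
interface Piece {
  title: string;
  metaDescription: string;
  wordCount: number;
  h1Count: number;
  internalLinks: number;
}

interface SeoCheck {
  name: string;
  passes: (p: Piece) => boolean;
}

// Illustrative subset of the 10-point gate.
const checks: SeoCheck[] = [
  { name: "title length", passes: (p) => p.title.length >= 30 && p.title.length <= 60 },
  { name: "meta description length", passes: (p) => p.metaDescription.length >= 120 && p.metaDescription.length <= 160 },
  { name: "single H1", passes: (p) => p.h1Count === 1 },
  { name: "minimum depth", passes: (p) => p.wordCount >= 800 },
  { name: "internal links", passes: (p) => p.internalLinks >= 3 },
];

// All-or-nothing: failures are returned by name so the SEO Audit View
// can show exactly which checks blocked publishing.
function validateSeo(piece: Piece): { passed: boolean; failures: string[] } {
  const failures = checks.filter((c) => !c.passes(piece)).map((c) => c.name);
  return { passed: failures.length === 0, failures };
}
```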
Phase 5: Polish & Automation (Weeks 13-15)
- Full end-to-end workflow testing
- Internal linking automation (bidirectional)
- AI Overview monitoring
- Competitor change alert system
- Performance dashboards
- Prompt optimization based on output quality data
- Documentation and runbooks
Phase 6: Optimization (Ongoing)
- Mastra eval framework integration (measure content quality over time)
- A/B test different agent prompts and models
- Cost optimization (model routing based on task complexity)
- Scale testing (50+ pieces/month throughput)
- Content refresh pipeline (automated identification and updating of stale content)
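The cost-optimization item above (model routing by task complexity) amounts to a lookup from task class to model tier. A sketch, with placeholder model identifiers and an assumed three-tier classification:

```typescript
type Complexity = "simple" | "standard" | "complex";

// Placeholder model names; the real mapping would use exact API model
// identifiers and be tuned from the Phase 6 eval data.
const routes: Record<Complexity, string> = {
  simple: "claude-haiku",    // e.g. alt text, slugs, metadata
  standard: "claude-sonnet", // drafting, editing
  complex: "claude-opus",    // strategy synthesis, plan generation
};

function routeModel(task: { complexity: Complexity }): string {
  return routes[task.complexity];
}
```

Routing mechanical subtasks to a cheaper tier is what keeps the ~$4-8 per-piece LLM figure in Section 10 plausible as throughput scales.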
10. Cost Estimation (Monthly, at Scale)
| Cost Category | Estimate (50 pieces/month) | Notes |
|---|---|---|
| LLM API (Claude Sonnet) | $200-400 | ~$4-8 per piece across all agents |
| SEO Data API (Semrush) | $119-229 | Business plan for API access |
| Image Generation | $50-100 | ~$1-2 per piece for 2-3 images each |
| Hosting (Vercel + Railway) | $50-100 | App + background workers |
| PostgreSQL (managed) | $25-50 | Neon, Supabase, or Railway Postgres |
| Google Search Console | Free | API access included |
| Total | ~$450-880/month | Sum of the line items above |
At 50 pieces/month, this compares favorably to manual content production at $200-500+ per piece ($10,000-25,000+/month) at a comparable quality level.
11. Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| LLM output quality inconsistency | High | Medium | Multi-agent review pipeline + human gate + programmatic checks |
| Google algorithm update targeting AI content | Medium | High | 80/20 human-AI method ensures genuine expertise in every piece |
| API cost overruns at scale | Medium | Medium | Token budget per piece, model routing (cheaper models for simple tasks), caching |
| Competitor data accuracy (scraping failures) | Medium | Low | Fallback to API data, alerting on crawl failures, manual override |
| Hallucination in published content | Medium | High | Fact-check agent + human review + RAG grounding + source citation requirements |
| Over-optimization (content feels robotic) | Medium | Medium | Brand voice samples, editor agent voice consistency checks, human tone review |
| Workflow complexity / debugging difficulty | Medium | Medium | Mastra Studio tracing, comprehensive logging, workflow versioning |
| CMS integration fragility | Low | Medium | Adapter pattern — swap CMS connectors without changing pipeline logic |
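The CMS-fragility mitigation is the adapter pattern: the Publishing Agent depends only on a small interface, so swapping WordPress for Sanity or Contentful means writing a new adapter, not touching pipeline logic. A minimal sketch (the stub below fakes the publish call; a real adapter would call the WordPress REST API asynchronously):

```typescript
interface CmsAdapter {
  name: string;
  // Kept synchronous here so the sketch is self-contained; real
  // adapters would return a Promise from an HTTP call.
  publish(piece: { title: string; body: string }): { url: string };
}

// Illustrative stub standing in for a WordPress REST API client.
const wordpressAdapter: CmsAdapter = {
  name: "wordpress",
  publish: (piece) => ({
    url: `/blog/${piece.title.toLowerCase().replace(/\s+/g, "-")}`,
  }),
};

// The pipeline only ever sees CmsAdapter, never a concrete CMS client.
function publishPiece(cms: CmsAdapter, piece: { title: string; body: string }): string {
  return cms.publish(piece).url;
}
```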
This document is a living specification. It should be updated as architectural decisions are made during implementation and as the system evolves through testing and production use.