MDX Limo
Alton Wells — AI Content Engine: Architecture & Scope

Project Codename: ContentEngine
Framework: Mastra (TypeScript)
Status: Architecture Review Draft
Date: March 2026


1. System Overview

This document defines the architecture, data model, agent design, workflow orchestration, and scope for an autonomous AI-powered content production system. The system operates across three layers — Strategy, Content, and Production — mirroring the flow diagram, with a fourth Data Infrastructure layer underpinning everything.

The core loop is:

Competitors + Market Data + Our Performance + Our Strategy
  → AI generates a Content Plan (calendar)
  → Human reviews/adjusts the plan
  → AI autonomously executes each piece through the pipeline
  → Human reviews final output
  → Programmatic SEO validation (must pass 10/10)
  → Publish

The system is designed so that humans set strategy and approve output, while AI handles everything in between — research, planning, writing, editing, optimization, image generation, and SEO validation.


2. Data Infrastructure Layer

This is the foundation. Every agent in the system reads from and writes to these data stores. Without clean, continuously updated data, the agents cannot make good decisions.

2.1 Competitor Intelligence Database

Purpose: A living, continuously updated database of every competitor — their domains, sitemaps, page inventory, content topics, publishing frequency, and content changes over time.

Schema (core tables):

| Table | Key Fields | Update Frequency |
|---|---|---|
| competitors | id, name, domain, industry_vertical, notes | Manual + agent-suggested |
| competitor_sitemaps | id, competitor_id, sitemap_url, last_crawled_at, page_count | Daily crawl |
| competitor_pages | id, competitor_id, url, title, meta_description, h1, h2s[], word_count, published_at, last_modified, content_hash | Daily crawl |
| competitor_content_analysis | id, page_id, topics[], keywords[], content_type, estimated_traffic, serp_position, quality_score | Weekly analysis |
| competitor_changes | id, page_id, change_type (new/updated/removed), detected_at, diff_summary | Daily diff |

Data Collection Agents/Jobs:

| Job | Schedule | What It Does |
|---|---|---|
| Sitemap Crawler | Daily, 2 AM | Fetches and parses all competitor sitemaps. Detects new pages, removed pages, and last-modified changes. Stores raw sitemap XML and parsed entries. |
| Page Scraper | Daily, 3 AM | For new/changed pages, fetches the page, extracts title, meta, headings, word count, content body. Computes content hash for change detection. |
| Content Analyzer | Weekly, Sunday | Runs an LLM analysis pass over new/changed competitor pages. Extracts topics, identifies content type (blog, guide, landing page, comparison), estimates quality. |
| Competitor Diff Reporter | Daily, 6 AM | Generates a digest of all competitor content changes in the last 24 hours. Posts to Slack/dashboard. Flags high-priority changes (new pages targeting our keywords). |

Key Design Decision: Store the full text content of competitor pages (not just metadata). This feeds the Strategy Agent's ability to analyze positioning, messaging gaps, and content depth. Store as plain text extracted from HTML, not raw HTML.
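
The hash-based change detection above can be sketched as a pure function. Hashing the extracted plain text rather than the raw HTML means markup-only changes do not register as content changes; extractText and contentHash are illustrative names, not part of the spec:

```typescript
import { createHash } from "node:crypto";

// Strip HTML down to normalized plain text: drop scripts/styles, remove
// tags, collapse whitespace. This is the "plain text extracted from HTML"
// that gets stored and hashed.
export function extractText(html: string): string {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<style[\s\S]*?<\/style>/gi, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// A changed hash marks the page for re-analysis; an identical hash lets
// the crawler skip the page even if the markup was reshuffled.
export function contentHash(html: string): string {
  return createHash("sha256").update(extractText(html)).digest("hex");
}
```

Because whitespace and tags are normalized away, a template redesign that leaves the copy untouched produces the same hash and no spurious `competitor_changes` row.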

2.2 Search & Keyword Performance Database

Purpose: Centralized store of keyword research data, SERP landscape, search trends, and generative search (AI Overview) appearances.

Schema (core tables):

| Table | Key Fields | Update Frequency |
|---|---|---|
| keywords | id, keyword, search_volume, difficulty, cpc, intent (informational/transactional/etc.), cluster_id | Weekly refresh |
| keyword_clusters | id, name, primary_keyword_id, topic, priority | Manual + AI-suggested |
| serp_snapshots | id, keyword_id, snapshot_date, organic_results[], featured_snippet, people_also_ask[], ai_overview_present, ai_overview_sources[] | Weekly |
| ai_overview_tracking | id, keyword_id, detected_at, our_site_cited (bool), cited_sources[], summary_text | Daily for priority keywords |
| search_trends | id, keyword_id, date, volume_index, yoy_change | Monthly |

Data Sources & Integrations:

| Source | What It Provides | Integration Method |
|---|---|---|
| Semrush / Ahrefs API | Search volume, difficulty, CPC, SERP features, competitor rankings | REST API via Mastra tool |
| Google Search Console API | Our impressions, clicks, CTR, average position per query | OAuth2 API via Mastra tool |
| Google Trends API (unofficial) | Relative search interest over time | Scraping or unofficial API |
| SERP Scraper | Full SERP snapshots including AI Overview detection | Custom tool (SerpAPI or Browserbase) |
| Generative search monitors | AI Overview presence and citation tracking | Custom OpenClaw cron or Mastra scheduled workflow |
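
As one concrete integration point, a gsc_performance_query tool would wrap the Search Console Search Analytics endpoint. A sketch of the request body it might send; field names follow the searchanalytics.query API, while the dimension choice and row limit here are illustrative defaults:

```typescript
// Build the POST body for GSC searchanalytics.query. One row is returned
// per (query, page) pair within the inclusive date range.
export function buildGscQuery(startDate: string, endDate: string) {
  return {
    startDate,                      // inclusive, "YYYY-MM-DD"
    endDate,                        // inclusive, "YYYY-MM-DD"
    dimensions: ["query", "page"],  // per-query, per-page breakdown
    rowLimit: 25000,                // API maximum per request
  };
}
```

The rows (impressions, clicks, CTR, position) map directly onto the our_page_performance table in Section 2.3.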

2.3 Our Content & Performance Database

Purpose: Complete inventory of all our published content, its structure, performance metrics, and metadata. This is what the agents use to understand "what we have" and "how it's performing."

Schema (core tables):

| Table | Key Fields | Update Frequency |
|---|---|---|
| our_pages | id, url, title, slug, content_type, status (published/draft/planned), published_at, last_updated, word_count, content_body | On publish/update |
| our_page_seo | id, page_id, meta_title, meta_description, h1, h2s[], canonical_url, schema_markup, internal_links_out[], internal_links_in[], seo_score | On publish + weekly audit |
| our_page_performance | id, page_id, date, impressions, clicks, ctr, avg_position, sessions, bounce_rate, avg_time_on_page | Daily from GSC + Analytics |
| our_page_keywords | id, page_id, keyword_id, target (bool), current_position, position_change_7d, position_change_30d | Daily |
| content_embeddings | id, page_id, chunk_index, embedding_vector, chunk_text | On publish/update |

Key Design Decision: Maintain a vector embedding index of all our content (using Mastra's RAG pipeline). This powers internal linking automation, content gap detection, and duplicate/cannibalization detection.
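
A minimal sketch of the cannibalization check this index enables, with embeddings as plain number arrays. In the real system the vectors would come from the Mastra RAG pipeline and live in pgvector, and the 0.9 threshold is an assumption:

```typescript
// Standard cosine similarity between two equal-length vectors.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Flag a proposed topic whose embedding sits too close to any existing
// page's embedding — that overlap is what cannibalization looks like here.
export function isCannibalizing(
  candidate: number[],
  existing: number[][],
  threshold = 0.9,
): boolean {
  return existing.some((e) => cosineSimilarity(candidate, e) >= threshold);
}
```

In production this comparison runs as a pgvector nearest-neighbor query rather than a full scan; the logic is the same.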

2.4 Content Strategy & Planning Database

Purpose: The content calendar, strategy directives, brand voice guidelines, and editorial configuration that humans set and the AI operates against.

Schema (core tables):

| Table | Key Fields | Update Frequency |
|---|---|---|
| content_strategy | id, name, description, target_audience, brand_voice_guidelines, content_pillars[], priorities, active (bool) | Manual (human-set) |
| content_plan_items | id, strategy_id, title, target_keyword_id, content_type, status (planned/in_progress/review/published), scheduled_date, assigned_agent_run_id, priority, notes, source (ai_generated/human_added) | AI-generated + human-edited |
| content_briefs | id, plan_item_id, outline, target_word_count, target_keywords[], competitor_references[], internal_link_targets[], research_notes, approved (bool) | AI-generated, human-approved |
| content_drafts | id, plan_item_id, version, content_body, seo_score, editor_notes, status (draft/edited/approved/rejected) | AI-generated per pipeline run |
| brand_voice_samples | id, strategy_id, sample_text, tone_tags[], notes | Manual upload |

Key Design Decision: The content_plan_items table has a source field — either ai_generated or human_added. Both the AI planning agent and human users can add items to the calendar. The AI respects human-added items as fixed constraints when generating its plans.
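
A sketch of how the planner might honor that constraint when writing new items, treating human-added entries as fixed. Field names mirror content_plan_items; the one-item-per-day collision rule is an assumption for illustration:

```typescript
interface PlanItem {
  title: string;
  scheduledDate: string;  // ISO date, e.g. "2026-03-10"
  source: "ai_generated" | "human_added";
}

// Human-added items are fixed constraints: AI proposals that collide with
// a human-held date are dropped rather than rescheduling the human's item.
export function mergePlan(existing: PlanItem[], proposed: PlanItem[]): PlanItem[] {
  const humanDates = new Set(
    existing.filter((i) => i.source === "human_added").map((i) => i.scheduledDate),
  );
  const accepted = proposed.filter((i) => !humanDates.has(i.scheduledDate));
  return [...existing, ...accepted];
}
```

A real planner would likely reschedule the colliding proposal instead of dropping it, but the invariant is the same: human-added items never move.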


3. Strategy Layer — Agents & Workflows

This layer corresponds to the top section of your flow diagram. The agents here consume all the data infrastructure, synthesize it, and produce actionable content plans.

3.1 Competitive Intelligence Agent

ID: competitive-intelligence-agent
Model: anthropic/claude-sonnet-4-20250514

Role: Continuously analyzes the competitor database and produces strategic insights about competitor positioning, content gaps, and opportunities.

Inputs:

  • Competitor pages database (new and changed content)
  • Our content database (for comparison)
  • Keyword performance data

Tools:

  • query_competitor_pages — Searches competitor content by topic, keyword, or content type
  • query_our_pages — Searches our content for comparison
  • competitor_diff_summary — Gets recent competitor content changes
  • web_search — Validates findings against live web data

Outputs (structured, Zod-validated):

```typescript
{
  competitorMoves: [{
    competitor: string,
    action: "new_content" | "content_update" | "new_topic",
    details: string,
    relevanceToUs: "high" | "medium" | "low",
    suggestedResponse: string,
  }],
  contentGaps: [{
    topic: string,
    competitorsCovering: string[],
    ourCoverage: "none" | "weak" | "adequate",
    opportunity: string,
    estimatedImpact: "high" | "medium" | "low",
  }],
  positioningInsights: string,
}
```

Schedule: Weekly (full analysis), daily (change digest only).

3.2 Search Landscape Agent

ID: search-landscape-agent
Model: anthropic/claude-sonnet-4-20250514

Role: Monitors keyword performance, SERP changes, AI Overview appearances, and search trends. Identifies where we're winning, losing, and where new opportunities are emerging.

Inputs:

  • Keyword + SERP snapshot database
  • Our page performance data (GSC)
  • AI Overview tracking data
  • Search trend data

Tools:

  • query_keyword_performance — Pulls our ranking data for specific keywords
  • query_serp_snapshots — Gets SERP landscape for keywords
  • query_ai_overview_tracking — Checks AI Overview presence and our citation status
  • semrush_keyword_research — Fetches fresh keyword data from Semrush API
  • gsc_performance_query — Pulls real-time data from Google Search Console

Outputs (structured):

```typescript
{
  rankingChanges: [{ keyword, previousPosition, currentPosition, trend, url }],
  aiOverviewAlerts: [{ keyword, ourSiteCited, topCitedSources, recommendation }],
  emergingKeywords: [{ keyword, volume, difficulty, relevance, opportunity }],
  decliningContent: [{ url, keyword, positionDrop, suggestedAction }],
  searchTrendShifts: [{ topic, direction, magnitude, implication }],
}
```

Schedule: Daily for ranking changes, weekly for full landscape analysis.
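
The rankingChanges rows can be derived deterministically from two SERP position snapshots before any LLM reasoning runs; the trend labels here are assumptions about the output vocabulary:

```typescript
// Compare two rank-tracking snapshots for one keyword. A lower position
// number is better, so a positive delta means we moved up.
export function rankingChange(previousPosition: number, currentPosition: number) {
  const delta = previousPosition - currentPosition;
  const trend = delta > 0 ? "up" : delta < 0 ? "down" : "flat";
  return { previousPosition, currentPosition, delta, trend };
}
```

The agent's job is then interpretation (why did this move, what should we do), not arithmetic.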

3.3 Content Strategy Agent (The Planner)

ID: content-strategy-agent
Model: anthropic/claude-sonnet-4-20250514

Role: The brain of the Strategy Layer. Synthesizes outputs from the Competitive Intelligence Agent, Search Landscape Agent, our content inventory, our strategy directives, and brand guidelines to produce a prioritized content plan.

This is the agent that "thinks through all of these and comes up with a plan" from your diagram.

Inputs:

  • Competitive Intelligence Agent output
  • Search Landscape Agent output
  • Current content strategy (human-set pillars, priorities, brand voice)
  • Existing content plan items (both AI-generated and human-added)
  • Our content inventory + performance data
  • Content embeddings (to avoid duplication)

Tools:

  • query_content_strategy — Gets current strategy directives
  • query_content_plan — Gets existing planned/scheduled items
  • query_our_content_inventory — Searches our published content
  • vector_similarity_search — Checks if proposed topics overlap with existing content
  • web_search — Researches topics for viability and resource discovery
  • add_content_plan_item — Writes new items to the content calendar
  • update_content_plan_item — Modifies existing plan items

Workflow:

1. Load current strategy directives and priorities
2. Ingest Competitive Intelligence report
3. Ingest Search Landscape report
4. Review existing content plan (what's already scheduled)
5. Review our content inventory (what we've already published)
6. Identify gaps between strategy goals and current coverage
7. Generate candidate content ideas with rationale
8. Score and prioritize candidates by:
   - Strategic alignment (does it serve our pillars?)
   - Search opportunity (volume × 1/difficulty)
   - Competitive urgency (are competitors gaining ground?)
   - Content gap severity (do we have zero coverage?)
   - GEO potential (can this be cited by AI search?)
9. Check for duplication/cannibalization via vector similarity
10. Research top candidates (web search for viability, resource links)
11. Produce prioritized content plan with scheduling recommendations
12. Write plan items to database → SUSPEND for human review
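
Step 8 lends itself to a deterministic scoring pass. A sketch with assumed weights and 0-1 normalization (neither is specified in this document); only the volume × 1/difficulty formula comes from the text:

```typescript
interface Candidate {
  strategicAlignment: number;  // 0-1, does it serve our pillars?
  searchVolume: number;        // monthly searches
  difficulty: number;          // 1-100 keyword difficulty
  competitiveUrgency: number;  // 0-1, are competitors gaining ground?
  gapSeverity: number;         // 0-1, 1 = zero coverage
  geoPotential: number;        // 0-1, citability by AI search
}

export function scoreCandidate(c: Candidate): number {
  // Search opportunity per the workflow: volume × 1/difficulty, squashed
  // to 0-1. The /100 scale factor and the weights below are assumptions.
  const searchOpportunity = Math.min(1, c.searchVolume / Math.max(1, c.difficulty) / 100);
  return (
    0.3 * c.strategicAlignment +
    0.25 * searchOpportunity +
    0.2 * c.competitiveUrgency +
    0.15 * c.gapSeverity +
    0.1 * c.geoPotential
  );
}
```

Keeping the scoring numeric and outside the LLM makes prioritization reproducible; the agent supplies the component estimates and the rationale.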

Outputs:

```typescript
{
  planItems: [{
    title: string,
    targetKeyword: string,
    contentType: "blog_post" | "guide" | "landing_page" | "comparison" | "case_study",
    rationale: string,             // Why this piece, why now
    competitiveContext: string,    // What competitors are doing
    suggestedScheduleDate: Date,
    priority: 1 | 2 | 3,
    estimatedImpact: string,
    researchLinks: string[],       // Pre-found resources to link/reference
    internalLinkTargets: string[], // Our existing pages to link to/from
  }],
  strategyNotes: string,    // Agent's high-level strategic thinking
  calendarSummary: string,  // Overview of the proposed schedule
}
```

Human Interaction Point: After the agent generates the plan, the workflow suspends (Mastra .waitForEvent()). The human reviews the proposed calendar in the app UI, where they can approve, reject, or edit items, add their own, and reorder priorities. When they click "Approve Plan," the workflow resumes and the approved items are queued for execution.

3.4 Content Brief Agent

ID: content-brief-agent
Model: anthropic/claude-sonnet-4-20250514

Role: For each approved content plan item, generates a detailed content brief that the Writer Agent will execute against.

Inputs:

  • Approved content plan item (from calendar)
  • Competitor content analysis (what top-ranking pages cover)
  • Keyword data (primary, secondary, related terms)
  • Our existing content (for internal linking opportunities)
  • Brand voice guidelines

Tools:

  • query_serp_snapshots — Gets current SERP landscape for the target keyword
  • scrape_competitor_page — Extracts structure and content from top-ranking competitor pages
  • query_our_content_inventory — Finds internal linking opportunities
  • web_search — Finds additional resources, data sources, expert quotes
  • vector_similarity_search — Ensures the brief doesn't duplicate existing content

Outputs (structured):

```typescript
{
  title: string,
  targetKeyword: string,
  secondaryKeywords: string[],
  searchIntent: string,
  targetWordCount: number,
  contentFormat: string,  // "How-to guide", "Listicle", "Deep dive", etc.
  outline: [{
    heading: string,
    level: "h2" | "h3",
    keyPoints: string[],
    targetKeywords: string[],  // Keywords to naturally include in this section
    suggestedWordCount: number,
  }],
  competitorAnalysis: string,  // What top 5 do well, where they fall short
  differentiators: string[],   // Our unique angles
  internalLinkTargets: [{
    url: string,
    anchorTextSuggestion: string,
    contextNote: string,
  }],
  externalResources: [{
    url: string,
    description: string,
    useCase: string,  // "Cite as source", "Link for reader", "Reference for accuracy"
  }],
  toneAndStyle: string,
  audienceNotes: string,
  seoRequirements: {
    metaTitleGuideline: string,
    metaDescriptionGuideline: string,
    schemaType: string,
    featuredSnippetTarget: boolean,
  },
}
```

4. Content Layer — Agents & Workflows

This layer corresponds to the middle section of your flow diagram. It takes an approved content brief and produces a polished draft.

4.1 Writer Agent

ID: writer-agent
Model: anthropic/claude-sonnet-4-20250514
Instructions: Dynamic — loads brand voice guidelines + brief-specific tone directives

Role: Receives a content brief and produces a complete first draft that follows the outline, hits the word count targets, incorporates keywords naturally, includes internal and external links, and matches the brand voice.

Key Prompting Strategy:

The Writer Agent's system prompt is assembled dynamically for each piece:

```
Base Instructions (static):
  - Writing quality standards
  - Formatting rules (heading hierarchy, paragraph length, etc.)
  - Link insertion patterns
  - Keyword integration rules (natural, not stuffed)

+ Brand Voice Guidelines (from content_strategy table):
  - Tone descriptors
  - Sample passages
  - Vocabulary preferences / words to avoid

+ Brief-Specific Directives (from content_briefs table):
  - The full outline with section-level instructions
  - Target keywords per section
  - Differentiators to emphasize
  - Internal/external links to include
```
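
The assembly itself is plain string composition. A sketch in which the section headers and ordering are assumptions; the document only specifies the three sources:

```typescript
interface PromptParts {
  baseInstructions: string;      // static writing/formatting rules
  brandVoiceGuidelines: string;  // from the content_strategy table
  briefDirectives: string;       // from the content_briefs table
}

// Concatenate the three layers of the Writer Agent's system prompt in the
// order the document lists them: base rules first, voice, then the brief.
export function assembleWriterPrompt(p: PromptParts): string {
  return [
    p.baseInstructions,
    "## Brand Voice Guidelines\n" + p.brandVoiceGuidelines,
    "## Brief-Specific Directives\n" + p.briefDirectives,
  ].join("\n\n");
}
```

Because the prompt is rebuilt per piece, changing a row in content_strategy or content_briefs changes the next run without any code deploy.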

Tools:

  • web_search — For real-time fact verification during writing
  • query_our_content — To check consistency with existing published content
  • vector_similarity_search — To find additional internal linking opportunities during writing

Output: Full markdown content body with frontmatter metadata, internal links, external links, and image placement markers ([IMAGE: description of needed image]).

4.2 Editor Agent

ID: editor-agent
Model: anthropic/claude-sonnet-4-20250514
Instructions: Focused on language correctness, verbal consistency, and brand voice adherence

Role: Reviews the Writer Agent's draft for language correctness, verbal consistency, factual accuracy, brand voice adherence, and structural quality. Does NOT rewrite — provides specific edits and corrections.

Checks performed:

| Check Category | What It Evaluates |
|---|---|
| Language Correctness | Grammar, spelling, punctuation, sentence structure |
| Verbal Consistency | Consistent terminology (don't switch between "users" and "customers" randomly), consistent formatting of terms, consistent voice (active vs. passive) |
| Brand Voice Adherence | Compares against brand voice samples. Flags sections that drift from established tone |
| Factual Consistency | Cross-references claims with the brief's source material. Flags unsubstantiated statistics |
| Structural Quality | Heading hierarchy compliance, section length balance, transition quality, intro/conclusion effectiveness |
| Link Quality | Verifies all internal links point to real pages, checks anchor text naturalness, validates external link relevance |
| Keyword Integration | Confirms target keywords appear in required positions (title, H1, first paragraph, H2s) without over-optimization |

Output:

```typescript
{
  overallAssessment: "pass" | "needs_revision",
  editsList: [{
    location: string,  // Section/paragraph identifier
    type: "grammar" | "voice" | "factual" | "structural" | "keyword" | "link",
    severity: "critical" | "suggested",
    original: string,
    suggested: string,
    rationale: string,
  }],
  voiceConsistencyScore: number,  // 0-100
  readabilityScore: number,       // Flesch-Kincaid or similar
  revisedContent: string,         // Full draft with all edits applied
}
```

Workflow Logic: If overallAssessment === "needs_revision" and critical edits exist, the revised content loops back through the Writer Agent with the edit list as context. Maximum 2 revision cycles, then force-proceed to human review.
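
The loop guard is simple enough to state as code; a minimal sketch (the function name is illustrative):

```typescript
// Revise only while critical edits remain and fewer than two cycles have
// run; otherwise force-proceed to human review, per the workflow logic.
export function shouldRevise(
  assessment: "pass" | "needs_revision",
  criticalEditCount: number,
  revisionCycle: number,
): boolean {
  return assessment === "needs_revision" && criticalEditCount > 0 && revisionCycle < 2;
}
```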

4.3 Content Layer Workflow (Mastra)

```typescript
const contentLayerWorkflow = createWorkflow({
  id: "content-layer-pipeline",
  inputSchema: z.object({ briefId: z.string() }),
  outputSchema: z.object({ draftId: z.string(), seoScore: z.number() }),
})
  .then(loadBriefStep)       // Load the approved content brief
  .then(writerAgentStep)     // Writer produces first draft
  .then(editorAgentStep)     // Editor reviews and revises
  .branch({                  // Conditional: needs another pass?
    condition: ({ editorOutput }) =>
      editorOutput.overallAssessment === "needs_revision"
      && editorOutput.revisionCount < 2,
    trueStep: writerRevisionStep,  // Loop back to writer with edits
    falseStep: finalizeDraftStep,  // Proceed to final draft
  })
  .then(saveDraftStep)       // Persist final draft to database
  .commit();
```

5. Production Layer — Agents, Checks & Publishing

This layer corresponds to the bottom section of your flow diagram. It takes the polished draft through human review, image generation, final edits, programmatic SEO validation, and publishing.

5.1 Human Review (Suspend/Resume)

This is the critical quality gate.

When the Content Layer produces a final draft:

  1. Draft is saved to content_drafts with status awaiting_review
  2. Notification sent (Slack, email, or in-app)
  3. Workflow suspends via .waitForEvent("human-review-complete")
  4. Human opens the draft in the app UI
  5. Human can:
    • Approve — proceed as-is
    • Edit — make changes directly, then approve
    • Reject — send back to Content Layer with notes (triggers a new Writer Agent run with human feedback as context)
    • Add personal experience/insights — the critical 20% from the 80/20 method
  6. On approval, workflow resumes with the human-edited content

UI Requirements:

  • Side-by-side view: AI draft vs. content brief
  • Inline editing with change tracking
  • Comment/annotation capability
  • SEO score preview (live-updating as human edits)
  • One-click approve/reject buttons

5.2 Image Generation Agent

ID: image-generation-agent
Model: Image generation API (DALL-E 3, Midjourney API, or Flux)

Role: Processes the [IMAGE: description] markers in the content and generates appropriate images.

Workflow:

  1. Parse all image markers from the approved content
  2. For each marker, generate a detailed image prompt based on the description and content context
  3. Generate the image via API
  4. Optimize the image (compress, resize to target dimensions)
  5. Generate SEO-optimized alt text
  6. Upload to CDN / media library
  7. Replace markers in content with proper <img> tags including alt text, width/height, and loading="lazy"

Output: Updated content body with all image markers replaced by actual image references + alt text.
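
Steps 1 and 7 reduce to marker parsing and substitution. A sketch in which resolveImage stands in for the generate/optimize/upload steps (2 through 6); its return shape is an assumption:

```typescript
// Matches the [IMAGE: description] markers the Writer Agent emits.
const IMAGE_MARKER = /\[IMAGE:\s*([^\]]+)\]/g;

// Step 1: collect every marker description from the content body.
export function parseImageMarkers(content: string): string[] {
  return [...content.matchAll(IMAGE_MARKER)].map((m) => m[1].trim());
}

// Step 7: swap each marker for an <img> tag with alt text, explicit
// dimensions, and lazy loading, as the workflow specifies.
export function replaceImageMarkers(
  content: string,
  resolveImage: (description: string) => { src: string; alt: string; width: number; height: number },
): string {
  return content.replace(IMAGE_MARKER, (_match, desc: string) => {
    const img = resolveImage(desc.trim());
    return `<img src="${img.src}" alt="${img.alt}" width="${img.width}" height="${img.height}" loading="lazy" />`;
  });
}
```

Keeping width/height on every tag also feeds SEO check 10 (CLS prevention) downstream.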

5.3 Final Edits Agent

ID: final-edits-agent
Model: anthropic/claude-sonnet-4-20250514

Role: One last pass after human edits and image insertion. Ensures human edits didn't break formatting, images are properly placed, links still work, and the content is publication-ready.

Checks:

  • Markdown/HTML formatting validity
  • Image placement and alt text quality
  • Link integrity (no broken internal links)
  • Consistent formatting after human edits
  • Final readability pass

5.4 Programmatic SEO Validation (Must Score 10/10)

This is not an LLM agent — it's a deterministic, code-based validation engine. Every check has a binary pass/fail result. All 10 must pass.

```typescript
const seoChecks = {
  1: {
    name: "Meta Title",
    check: (content) => {
      // Length: 50-60 characters
      // Contains primary keyword
      // Unique (not used by any other page)
      // No truncation risk
    }
  },
  2: {
    name: "Meta Description",
    check: (content) => {
      // Length: 150-160 characters
      // Contains primary keyword
      // Includes call-to-action or value proposition
      // Unique across site
    }
  },
  3: {
    name: "Heading Hierarchy",
    check: (content) => {
      // Exactly one H1
      // H1 contains primary keyword
      // H2s use secondary keywords
      // No skipped levels (H1 → H3 without H2)
      // Logical nesting
    }
  },
  4: {
    name: "Keyword Optimization",
    check: (content) => {
      // Primary keyword in: title, H1, first 100 words, at least one H2, meta description
      // Keyword density: 0.5% - 2.5% (not over-optimized)
      // Secondary keywords present naturally
      // No keyword stuffing patterns detected
    }
  },
  5: {
    name: "Internal Linking",
    check: (content) => {
      // Minimum 3 internal links
      // All internal links point to valid, published pages
      // Anchor text is descriptive (no "click here")
      // Anchor text is diversified (not all exact-match keyword)
      // Links are contextually relevant
    }
  },
  6: {
    name: "External Linking",
    check: (content) => {
      // At least 1 external link to authoritative source
      // External links use rel="noopener" on new-tab links
      // No links to competitor domains (configurable blocklist)
      // External links are contextually relevant
    }
  },
  7: {
    name: "Content Quality Metrics",
    check: (content) => {
      // Word count meets target (within 10% of brief target)
      // Readability score within acceptable range (configurable)
      // No duplicate content detected (cosine similarity < threshold vs. existing pages)
      // Paragraph length: no paragraphs over 300 words
      // Sentence variety: mix of short and long sentences
    }
  },
  8: {
    name: "Technical SEO",
    check: (content) => {
      // Valid schema markup (JSON-LD) present and parseable
      // Canonical URL set correctly
      // Open Graph tags present (og:title, og:description, og:image)
      // Twitter Card tags present
      // Image alt text on all images
      // Image dimensions specified
    }
  },
  9: {
    name: "URL & Slug",
    check: (content) => {
      // Slug is URL-friendly (lowercase, hyphens, no special chars)
      // Slug contains primary keyword or close variant
      // Slug length: under 60 characters
      // No duplicate slug in our pages database
    }
  },
  10: {
    name: "Mobile & Performance",
    check: (content) => {
      // All images have width/height (prevents CLS)
      // Images use lazy loading
      // No inline styles that break mobile
      // Table responsiveness handled
      // No excessively large embedded content
    }
  }
};
```

Validation Flow:

  1. Run all 10 checks against the content
  2. If all pass → proceed to publish
  3. If any fail → generate a fix report, send back to Final Edits Agent with specific failure details, re-validate after fixes
  4. Maximum 3 fix cycles, then escalate to human with the failure report

Output:

```typescript
{
  score: "10/10" | "9/10" | etc,
  passed: boolean,
  checks: [{
    id: number,
    name: string,
    passed: boolean,
    details: string,          // What was checked
    failureReason?: string,   // Why it failed (if applicable)
    autoFixable: boolean,     // Can the agent fix this automatically?
  }],
}
```
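
To make "deterministic, code-based" concrete, here is check 1 written out in full. The 50-60 character band, keyword requirement, and uniqueness rule come from the check definitions above; the simplified CheckResult shape and function name are illustrative:

```typescript
interface CheckResult {
  passed: boolean;
  failureReason?: string;
}

// Binary meta-title check: length band, primary keyword presence, and
// site-wide uniqueness. No LLM involved — same input, same verdict.
export function checkMetaTitle(
  title: string,
  primaryKeyword: string,
  existingTitles: string[],
): CheckResult {
  if (title.length < 50 || title.length > 60) {
    return { passed: false, failureReason: `Length ${title.length} outside 50-60 characters` };
  }
  if (!title.toLowerCase().includes(primaryKeyword.toLowerCase())) {
    return { passed: false, failureReason: "Primary keyword missing from meta title" };
  }
  if (existingTitles.some((t) => t.toLowerCase() === title.toLowerCase())) {
    return { passed: false, failureReason: "Meta title already used by another page" };
  }
  return { passed: true };
}
```

The failureReason string is what the Final Edits Agent receives in the fix report, so it should name the exact rule that was violated.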

5.5 Publishing Agent

ID: publishing-agent
Model: N/A (primarily tool-driven, minimal LLM reasoning)

Role: Takes the validated, 10/10 content and publishes it to the CMS.

Actions:

  1. Format content for CMS (markdown → CMS content blocks, or HTML)
  2. Upload images to CMS media library (if not already CDN-hosted)
  3. Set all metadata (title, description, slug, canonical, schema, OG tags)
  4. Set publication date (from content plan schedule)
  5. Create/update XML sitemap entry
  6. Ping Google Indexing API / IndexNow for fast crawling
  7. Update our content database (our_pages, our_page_seo)
  8. Generate content embeddings and add to vector index
  9. Run bidirectional internal linking (find places in existing content to link to the new page)
  10. Schedule social media distribution (if configured)
  11. Log publish event to audit trail

Post-Publish Monitoring:

  • 24-hour check: Verify page is indexed (Google Search Console)
  • 7-day check: Initial ranking data and impressions
  • 30-day check: Performance review against projected targets
  • Auto-flag underperforming content for refresh consideration

6. Master Workflow Orchestration

The entire system is orchestrated as a Mastra workflow that chains the three layers:

```
┌─────────────────────────────────────────────────────────────────┐
│ SCHEDULED TRIGGERS
│
│ Daily:       Competitor crawl, SERP monitoring, rank tracking
│ Weekly:      Full competitive analysis, search landscape report
│ On-demand:   Human triggers plan generation
│ On-schedule: Content calendar items trigger execution
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ STRATEGY LAYER WORKFLOW
│
│ 1. Competitive Intelligence Agent (async, data collection)
│ 2. Search Landscape Agent (async, data collection)
│ 3. Content Strategy Agent (synthesis + plan generation)
│ 4. ── SUSPEND ── Human reviews/edits content plan ── RESUME ──
│ 5. Content Brief Agent (generates brief per approved item)
│ 6. ── SUSPEND ── Human approves brief ── RESUME ──
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ CONTENT LAYER WORKFLOW
│
│ 7. Writer Agent (produces first draft from brief)
│ 8. Editor Agent (reviews, revises)
│ 9. ── LOOP ── if needs_revision && cycles < 2 → back to 7 ──
│ 10. Final draft saved to database
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ PRODUCTION LAYER WORKFLOW
│
│ 11. ── SUSPEND ── Human reviews draft (edits, adds
│     experience, approves) ── RESUME ──
│ 12. Image Generation Agent
│ 13. Final Edits Agent (post-human cleanup)
│ 14. Programmatic SEO Validation (10/10 required)
│ 15. ── LOOP ── if score < 10/10 → auto-fix → revalidate ──
│ 16. Publishing Agent (push to CMS, index, distribute)
│ 17. Post-publish monitoring scheduled
└─────────────────────────────────────────────────────────────────┘
```

Suspension Points (Human Touchpoints)

| # | Where | What Human Does | Estimated Time |
|---|---|---|---|
| 1 | After content plan generation | Review calendar, approve/reject/edit items, add own items | 15-30 min per planning cycle |
| 2 | After content brief generation | Review brief, confirm outline and direction | 5-10 min per brief |
| 3 | After final draft produced | Deep review, add personal insights, edit for voice, approve | 10-20 min per piece |

Total human time per content piece: ~30-60 minutes (compared to 4-8 hours for fully manual content creation).


7. Application UI Requirements

The system needs a web application (likely Next.js given Mastra's TypeScript ecosystem) with these core views:

7.1 Dashboard

  • Content pipeline status (items in each stage)
  • Today's scheduled publications
  • Competitor change alerts (last 24h)
  • Ranking changes (significant movers)
  • AI Overview citation tracking
  • Content performance summary

7.2 Competitor Monitor

  • Competitor list with domain, page count, last crawled
  • New/changed page feed (chronological)
  • Per-competitor content analysis (topics, volume, quality)
  • Side-by-side content comparison (their page vs. ours on same topic)

7.3 Keyword & Search Performance

  • Keyword tracker with rankings over time
  • SERP feature tracking (featured snippets, AI Overviews)
  • Search volume trends
  • Keyword cluster view
  • GSC data integration (impressions, clicks, CTR)

7.4 Content Calendar

  • Calendar view of planned/scheduled content
  • Drag-and-drop rescheduling
  • Status indicators (planned → in_progress → review → published)
  • AI-generated items vs. human-added items (visually distinguished)
  • One-click to view brief, draft, or published piece
  • "Generate Plan" button that triggers the Strategy Agent

7.5 Content Editor / Review Interface

  • Full content preview
  • Side-by-side: brief vs. draft
  • Inline editing with change tracking
  • SEO score panel (live-updating)
  • Comment/annotation system
  • Approve / Request Changes / Reject buttons
  • Image preview and alt text editing

7.6 SEO Audit View

  • 10-point SEO check results for each piece
  • Historical SEO scores across all content
  • Site-wide SEO health metrics
  • Internal linking map visualization
  • Technical SEO issue tracker

7.7 Strategy Settings

  • Brand voice configuration (samples, tone descriptors)
  • Content pillars and priorities
  • Competitor list management
  • Target keyword management
  • Publishing workflow configuration (which checks are required, auto-publish vs. manual)

8. Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Agent Framework | Mastra (@mastra/core) | TypeScript-native, workflow orchestration, RAG, tools, suspend/resume |
| Runtime | Node.js 20+ / Bun | Mastra's supported runtimes |
| Web Framework | Next.js 15+ | Mastra ecosystem alignment, SSR, API routes |
| Database | PostgreSQL + pgvector | Relational data + vector embeddings in one database |
| ORM | Drizzle ORM | TypeScript-native, great Postgres support |
| Vector Search | pgvector (via Mastra RAG) | Unified with primary database, no separate vector DB needed |
| LLM Provider | Anthropic Claude Sonnet 4 (primary) | Quality + cost balance for content generation |
| Image Generation | DALL-E 3 or Flux API | High quality, API-accessible |
| SEO Data API | Semrush or Ahrefs API | Keyword data, SERP snapshots, competitor data |
| Search Console | Google Search Console API | Our ranking/performance data |
| CMS Integration | WordPress REST API, Sanity, or Contentful | Depends on existing CMS — tool adapter pattern |
| Job Scheduling | Trigger.dev or Mastra cron | Durable execution for scheduled jobs |
| Notifications | Slack API + email | Human review notifications |
| Hosting | Vercel (app) + Railway/Render (workers) | Serverless for app, persistent processes for agents |
| Monitoring | Mastra Studio + custom observability | Agent tracing, workflow debugging |

9. Build Phases

Phase 1: Foundation (Weeks 1-3)

  • PostgreSQL schema setup (all tables defined in Section 2)
  • Mastra project scaffolding with agent/tool/workflow structure
  • Basic Next.js app shell with authentication
  • Google Search Console API integration (tool)
  • Semrush/Ahrefs API integration (tool)
  • Competitor sitemap crawler (scheduled job)
  • Competitor page scraper (scheduled job)
  • Our content inventory ingestion + embedding pipeline

Phase 2: Strategy Layer (Weeks 4-6)

  • Competitive Intelligence Agent
  • Search Landscape Agent
  • Content Strategy Agent (plan generation)
  • Content Brief Agent
  • Content Calendar UI (view, edit, approve)
  • Suspend/resume workflow for plan approval
  • Dashboard v1 (basic metrics)

Phase 3: Content Layer (Weeks 7-9)

  • Writer Agent with dynamic prompt assembly
  • Editor Agent with revision loop
  • Content Layer workflow with branching
  • Content Editor / Review UI
  • Brand voice configuration UI
  • Suspend/resume workflow for human review

Phase 4: Production Layer (Weeks 10-12)

  • Image Generation Agent
  • Final Edits Agent
  • Programmatic SEO Validation engine (10 checks)
  • Publishing Agent (CMS integration)
  • Post-publish monitoring jobs
  • SEO Audit View UI
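The Programmatic SEO Validation engine from Phase 4 is essentially a list of pass/fail predicates over a draft, where publishing requires every check to pass (10/10). A minimal sketch, assuming an illustrative `Draft` shape — the check names and thresholds here are placeholders, not the actual 10 checks:

```typescript
// Minimal sketch of the programmatic SEO validation engine.
interface Draft {
  title: string;
  metaDescription: string;
  h1Count: number;
  wordCount: number;
}

type SeoCheck = { id: string; passes: (d: Draft) => boolean };

// Illustrative checks; the real engine would define all 10.
const checks: SeoCheck[] = [
  { id: "title-length", passes: (d) => d.title.length >= 30 && d.title.length <= 60 },
  { id: "meta-length", passes: (d) => d.metaDescription.length >= 120 && d.metaDescription.length <= 160 },
  { id: "single-h1", passes: (d) => d.h1Count === 1 },
  { id: "min-words", passes: (d) => d.wordCount >= 800 },
];

/** The pipeline only publishes when score === total (10/10 in the full set). */
function validate(draft: Draft): { score: number; total: number; failed: string[] } {
  const failed = checks.filter((c) => !c.passes(draft)).map((c) => c.id);
  return { score: checks.length - failed.length, total: checks.length, failed };
}
```

Returning the failed check IDs (not just a score) lets the Final Edits Agent target its fixes and gives the SEO Audit View per-check history.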

Phase 5: Polish & Automation (Weeks 13-15)

  • Full end-to-end workflow testing
  • Internal linking automation (bidirectional)
  • AI Overview monitoring
  • Competitor change alert system
  • Performance dashboards
  • Prompt optimization based on output quality data
  • Documentation and runbooks

Phase 6: Optimization (Ongoing)

  • Mastra eval framework integration (measure content quality over time)
  • A/B test different agent prompts and models
  • Cost optimization (model routing based on task complexity)
  • Scale testing (50+ pieces/month throughput)
  • Content refresh pipeline (automated identification and updating of stale content)
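The cost-optimization item above (model routing based on task complexity) can be sketched as a simple lookup. The model identifiers and complexity buckets are assumptions for illustration:

```typescript
// Sketch of complexity-based model routing for cost optimization.
type TaskComplexity = "simple" | "moderate" | "complex";

const MODEL_BY_COMPLEXITY: Record<TaskComplexity, string> = {
  simple: "claude-haiku",    // e.g. alt text, slugs, metadata
  moderate: "claude-sonnet", // e.g. drafting, editing passes
  complex: "claude-opus",    // e.g. strategy synthesis, if budget allows
};

function routeModel(complexity: TaskComplexity): string {
  return MODEL_BY_COMPLEXITY[complexity];
}
```

Because each agent call passes through one router, swapping models or re-bucketing tasks is a one-line change rather than a per-agent edit.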

10. Cost Estimation (Monthly, at Scale)

| Cost Category | Estimate (50 pieces/month) | Notes |
|---|---|---|
| LLM API (Claude Sonnet) | $200-400 | ~$4-8 per piece across all agents |
| SEO Data API (Semrush) | $119-229 | Business plan for API access |
| Image Generation | $50-100 | ~$1-2 per piece for 2-3 images each |
| Hosting (Vercel + Railway) | $50-100 | App + background workers |
| PostgreSQL (managed) | $25-50 | Neon, Supabase, or Railway Postgres |
| Google Search Console | Free | API access included |
| Total | ~$450-880/month | |
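Summing the line items and dividing by the 50-piece monthly volume gives the per-piece cost implied by these estimates:

```typescript
// Back-of-envelope per-piece cost from the monthly line items above.
const monthlyLow = 200 + 119 + 50 + 50 + 25;    // low end of each category
const monthlyHigh = 400 + 229 + 100 + 100 + 50; // high end of each category
const pieces = 50;

const perPieceLow = monthlyLow / pieces;   // roughly $9 per piece
const perPieceHigh = monthlyHigh / pieces; // roughly $18 per piece
```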

This compares favorably to manual content production costs of $200-500+ per piece at the same quality level.


11. Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| LLM output quality inconsistency | High | Medium | Multi-agent review pipeline + human gate + programmatic checks |
| Google algorithm update targeting AI content | Medium | High | 80/20 human-AI method ensures genuine expertise in every piece |
| API cost overruns at scale | Medium | Medium | Token budget per piece, model routing (cheaper models for simple tasks), caching |
| Competitor data accuracy (scraping failures) | Medium | Low | Fallback to API data, alerting on crawl failures, manual override |
| Hallucination in published content | Medium | High | Fact-check agent + human review + RAG grounding + source citation requirements |
| Over-optimization (content feels robotic) | Medium | Medium | Brand voice samples, editor agent voice consistency checks, human tone review |
| Workflow complexity / debugging difficulty | Medium | Medium | Mastra Studio tracing, comprehensive logging, workflow versioning |
| CMS integration fragility | Low | Medium | Adapter pattern — swap CMS connectors without changing pipeline logic |
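The CMS adapter pattern named in the last mitigation means the pipeline codes against one interface while each CMS gets its own connector. A minimal sketch — the interface and class names are illustrative, and the WordPress connector body is stubbed rather than a real REST call:

```typescript
// Sketch of the CMS adapter pattern: pipeline depends only on CmsAdapter.
interface Post {
  title: string;
  body: string;
  slug: string;
}

interface CmsAdapter {
  publish(post: Post): Promise<string>; // resolves to the published URL
}

class WordPressAdapter implements CmsAdapter {
  constructor(private baseUrl: string) {}
  async publish(post: Post): Promise<string> {
    // A real implementation would POST to the WordPress REST API here.
    return `${this.baseUrl}/${post.slug}`;
  }
}

// Pipeline code never names a concrete CMS, so swapping in a Sanity or
// Contentful connector requires no pipeline changes.
async function publishPiece(cms: CmsAdapter, post: Post): Promise<string> {
  return cms.publish(post);
}
```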

This document is a living specification. It should be updated as architectural decisions are made during implementation and as the system evolves through testing and production use.
