
ContentEngine — Executive System Overview

Author: Alton Wells
Date: March 2026
Status: Final Architecture Specification
Consolidates: Technical Specification v3 + Addendums #1, #2, #3


Table of Contents

  1. Executive Summary
  2. Technology Stack
  3. Data Architecture
  4. Master Workflow
  5. Agent System
  6. Strategy Layer
  7. Content Layer
  8. Production Layer
  9. Content Type Registry
  10. Refresh Queue System
  11. Observability & Audit Trail
  12. Cost Model
  13. Risk Matrix
  14. Future: Filesystem-as-Context

1. Executive Summary

ContentEngine is an autonomous AI content production system that replaces manual content marketing workflows with a three-layer agentic pipeline. The system is built on Mastra (TypeScript agent framework), LangExtract (structured document extraction), and Firecrawl (web crawling and sitemap intelligence). It manages the full content lifecycle from competitive intelligence through publication and performance monitoring.

The system is designed for Consul, a B2B SaaS AI executive assistant targeting CEOs and founders at a $200/month price point. Content must build trust with sophisticated, high-ticket decision-makers, making E-E-A-T signals, programmatic SEO validation, and editorial quality non-negotiable.

Core Architectural Principles

  • Humans set strategy and approve output. AI executes everything in between. Three human gates govern strategic decisions, editorial review, and final publication.
  • No vectors. No embeddings. Structured extractions, hierarchical summaries, and an explicit content relationship graph replace RAG. Stanford's 2025 research shows embedding precision collapses 87% beyond 50K documents. This approach scales without dimensional decay.
  • Programmatic SEO validation is non-negotiable. Every published piece must pass all 10 blocking checks in a deterministic SEO validation suite, which also tracks 6 warning checks and 6 informational metrics.
  • Context is navigated, not stuffed. Agents traverse a four-level hierarchy (Domain → Cluster → Page → Entity) loading only what they need. Typical planning context: ~14,000 tokens instead of millions.
  • Everything is traceable. Every extraction maps to its source location. Every graph edge has provenance. Every agent decision is recorded in a structured audit trail with token usage, tool call sequences, and cost tracking.

2. Technology Stack

2.1 Core Framework

| Component | Technology | Role |
| --- | --- | --- |
| Mastra | @mastra/core (TypeScript) | Agent definitions, workflow orchestration, tool system, suspend/resume for human gates, Hono server generation |
| Vercel AI SDK | Foundation layer under Mastra | Unified model routing, streaming, structured output, tool calling protocol |
| Zod | Schema validation | Input/output schemas for every agent, tool, and workflow step with compile-time type safety |

2.2 Extraction & Intelligence

| Component | Technology | Role |
| --- | --- | --- |
| LangExtract | Python library (FastAPI sidecar) | Structured extraction from unstructured text. Source-grounded entities. Multi-pass extraction for high recall. |
| Firecrawl | Web crawling API/SDK | Competitor sitemap discovery, page crawling, content extraction. Handles JS-rendered pages and rate limiting. |
| Gemini 2.5 Flash | LLM ($0.15/1M tokens) | Extraction model. Fast, cheap, high quality for structured extraction tasks. |

2.3 LLM Providers

Claude Sonnet 4 (anthropic/claude-sonnet-4-20250514) powers all nine Mastra agents across strategy, content, and production layers. Temperature is configured per agent: 0.8 for the Writer (creative generation), 0.4 for the Editor (precision), 0.1–0.3 for production agents (mechanical execution). Gemini 2.5 Flash handles all LangExtract extraction pipelines and hierarchical summary generation.

2.4 Data & Hosting

PostgreSQL (no pgvector) serves as the primary database for all structured data, extraction entities, graph adjacency tables, summaries, and content plans using JSONB for flexible extraction attributes. Drizzle ORM provides type-safe database access. The application layer uses Next.js 15+ for the UI (calendar, editor, dashboards) hosted on Vercel, with agent workers and the LangExtract sidecar running on Railway. Trigger.dev handles durable job scheduling for crawls, extractions, summary regeneration, and post-publish monitoring.

2.5 System Architecture

2.6 External APIs

| API | Purpose |
| --- | --- |
| Semrush / Ahrefs | Keyword data, search volume, difficulty, SERP features, competitor rankings |
| Google Search Console | Impressions, clicks, CTR, average position per query (OAuth2) |
| Google Indexing / IndexNow | Fast crawl requests for newly published content |
| CMS Adapter | WordPress REST / Sanity / Contentful via adapter pattern |

3. Data Architecture

ContentEngine replaces conventional RAG/vector embeddings with a three-layer memory model. All data is stored in PostgreSQL as structured, queryable records.

3.1 Three-Layer Memory Model

3.2 Layer 1 — Structured Extraction (LangExtract)

Every document entering the system — competitor pages, our content, SERPs, AI Overviews, brand voice samples — is processed through LangExtract extraction pipelines. Raw text becomes structured, source-grounded entities stored in typed Postgres tables. Agents query structured data, not fuzzy similarity scores.

Six extraction classes are defined, each with a fixed schema, dedicated prompt description, and a minimum of three few-shot examples:

| Class | Entity Types | Trigger |
| --- | --- | --- |
| competitor_page | topic, claim, keyword_signal, content_structure, cta, entity_reference | On discovery or change (weekly scan) |
| our_page | Same as competitor + internal_link | On publish or bootstrap |
| serp | serp_result, serp_feature, paa_question | Weekly per tracked keyword |
| ai_overview | aio_claim, aio_structure | When AIO detected in SERP |
| brand_voice | tone_marker, vocabulary_preference, sentence_pattern | On strategy create/update |
| keyword_data | Deferred for MVP | Direct from Semrush API |

Integration decision: LangExtract runs as a Python FastAPI sidecar (not the unofficial Node SDK) because critical features — multi-pass extraction, cross-chunk coreference resolution, and controlled generation via Gemini schema constraints — are Python-only. A circuit breaker pattern (3 consecutive failures opens the circuit) mitigates the Python–TypeScript bridge as a single point of failure.
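The circuit breaker guarding the Python–TypeScript bridge can be sketched as follows. This is a minimal illustration of the stated policy (3 consecutive failures open the circuit, 60-second recovery window); class and function names are illustrative, not from the spec.

```typescript
// Minimal circuit-breaker sketch for calls to the LangExtract sidecar.
type CircuitState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  state: CircuitState = "closed";

  constructor(
    private readonly failureThreshold = 3, // 3 consecutive failures open the circuit
    private readonly recoveryMs = 60_000,  // 60-second recovery window
    private readonly now: () => number = Date.now,
  ) {}

  async exec<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.recoveryMs) {
        throw new Error("circuit open: LangExtract sidecar unavailable");
      }
      this.state = "half_open"; // probe the sidecar once the window elapses
    }
    try {
      const result = await fn();
      this.failures = 0; // any success resets the consecutive-failure count
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

While open, callers fail fast instead of stacking requests on a dead sidecar; a queued-retry fallback (per the risk matrix) can consume those fast failures.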

3.3 Layer 2 — Hierarchical Document Summaries

A four-level summary tree enables agents to navigate from broad domain context to specific entity-level detail, loading only the relevant branches. Agents start at Level 0 and drill down only where needed; summaries are regenerated from fresh extractions and timestamped for versioning.

3.4 Layer 3 — Content Relationship Graph

An adjacency table in Postgres with typed edges connects content entities. This replaces vector similarity for all "find related content" operations. Each edge carries a confidence score, provenance (which agent or system created it), and a last-validated timestamp.

| Edge Type | Source → Target | Meaning |
| --- | --- | --- |
| covers_topic | page → topic | Page covers this topic (with depth) |
| targets_keyword | page → keyword | Page targets this keyword (with rank) |
| competes_with | our_page → competitor_page | Pages compete for same keyword |
| outperforms | competitor_page → our_page | Competitor ranks higher for shared keyword |
| gap | topic → (null) | Topic with competitor coverage but zero ours |
| cannibalizes | our_page → our_page | Both target same primary keyword |
| links_to | our_page → our_page | Actual internal link exists |
| should_link_to | our_page → our_page | Agent-recommended linking opportunity |
| child_of | topic → topic_cluster | Hierarchical topic relationship |
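An in-memory sketch of the typed-edge model may help clarify how "find related content" becomes a plain query rather than a vector lookup. In the real system these rows live in a Postgres adjacency table; the types and helper names here are illustrative assumptions.

```typescript
// Illustrative in-memory model of the content relationship graph.
interface Edge {
  type: string;           // e.g. "gap", "cannibalizes", "outperforms"
  source: string;
  target: string | null;  // "gap" edges carry a null target
  confidence: number;     // 0..1, decays over time per the risk matrix
  provenance: string;     // which agent or system created the edge
  lastValidated: string;  // ISO timestamp
}

// Stands in for `SELECT * FROM content_graph WHERE edge_type = $1`.
function edgesOfType(graph: Edge[], type: string): Edge[] {
  return graph.filter((e) => e.type === type);
}

// Pre-planning check: is this page already in a cannibalization pair?
function wouldCannibalize(graph: Edge[], pageId: string): boolean {
  return graph.some(
    (e) => e.type === "cannibalizes" && (e.source === pageId || e.target === pageId),
  );
}
```

A "find gaps" operation is then `edgesOfType(graph, "gap")`: deterministic, explainable, and traceable via each edge's provenance.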

4. Master Workflow

ContentEngine operates across three workflow layers with human control at each transition. Workflows own the sequence ("what happens next"); agents own the execution ("how do I do this step"). This is deterministic orchestration with agentic execution — no supervisor/sub-agent patterns.

4.1 End-to-End Flow

4.2 Human Gates

Each human touchpoint is a Mastra workflow suspension. The workflow suspends and provides a structured payload describing the review task. The Next.js app renders the appropriate UI and calls the resume endpoint when the human completes their action.

| Gate | Human Actions | Est. Time |
| --- | --- | --- |
| Gate 1: Calendar Review | Review AI-generated content plan, approve/reject/edit items, add manual items, set schedule | 15–30 min per cycle |
| Gate 2: Brief Approval | Review outline, confirm direction, adjust scope, approve or request changes | 5–10 min per brief |
| Gate 3: Draft Review | Deep edit, add personal experience and insights, place images, final voice check, approve or reject | 15–30 min per piece |

Design principle: Image placement is manual at Gate 3. The Writer Agent outputs [IMAGE: description] markers as placement suggestions. Image selection requires brand aesthetic judgment, rights verification, and contextual sensitivity that current AI image generation does not handle reliably at production quality.


5. Agent System

5.1 Agent Configuration

Every agent's model, temperature, and step budget are configurable at runtime via a database settings table. No model IDs are hardcoded. This allows model swaps without redeployment. Changes take effect on the next agent invocation.

| Agent | Layer | Model | Max Steps | Temp |
| --- | --- | --- | --- | --- |
| competitive-intelligence | Strategy | Claude Sonnet 4 | 12 | 0.5 |
| search-landscape | Strategy | Claude Sonnet 4 | 10 | 0.5 |
| content-strategy | Strategy | Claude Sonnet 4 | 20 | 0.7 |
| content-brief | Content | Claude Sonnet 4 | 15 | 0.6 |
| writer | Content | Claude Sonnet 4 | 12 | 0.8 |
| editor | Content | Claude Sonnet 4 | 10 | 0.4 |
| final-cleanup | Production | Claude Sonnet 4 | 6 | 0.2 |
| publishing | Production | Claude Sonnet 4 | 15 | 0.1 |
| seo-autofix | Production | Claude Sonnet 4 | 8 | 0.3 |

Temperature rationale: Strategy agents are moderate (creative planning grounded in data). The Writer is highest (creative generation). The Editor is low (precision). Production agents are near-zero (mechanical execution).
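The runtime-configurable settings described above can be sketched as a per-invocation lookup. The map stands in for the database settings table; table contents and helper names are illustrative assumptions, not the spec's schema.

```typescript
// Sketch of runtime-resolved agent settings; resolved on every invocation
// so model/temperature/step changes apply without redeployment.
interface AgentSettings {
  model: string;
  temperature: number;
  maxSteps: number;
}

// Stand-in for `SELECT ... FROM agent_settings WHERE agent_id = $1`.
const settingsTable = new Map<string, AgentSettings>([
  ["writer", { model: "anthropic/claude-sonnet-4-20250514", temperature: 0.8, maxSteps: 12 }],
  ["editor", { model: "anthropic/claude-sonnet-4-20250514", temperature: 0.4, maxSteps: 10 }],
]);

function resolveSettings(agentId: string): AgentSettings {
  const settings = settingsTable.get(agentId);
  if (!settings) throw new Error(`no settings row for agent "${agentId}"`);
  return settings;
}
```

Because nothing is hardcoded at the call site, swapping the Writer to a new model is a single row update that takes effect on the next invocation.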

5.2 Tool Inventory (26 Tools)

Tools are grouped into four categories. Each agent receives only the tools it needs — a smaller tool surface means fewer irrelevant calls, lower token usage, and easier debugging.

| Category | Count | Tools |
| --- | --- | --- |
| Shared | 8 | readDomainSummary, readClusterSummaries, readPageSummaries, queryExtractions, traverseContentGraph, webSearch, queryContentPlan, readContentStrategy |
| Strategy | 8 | queryCompetitorChanges, queryKeywordPerformance, querySerpExtractions, queryAiOverviewTracking, semrushKeywordResearch, gscPerformanceQuery, addContentPlanItem, updateContentPlanItem |
| Content | 3 | loadApprovedBrief, queryOurPages, verifyInternalLinks |
| Production | 7 | formatForCms, uploadToCms, setMetadata, pingIndexingApi, triggerPostPublishPipeline, schedulePostPublishMonitoring, logToAuditTrail |

5.3 Tool-to-Agent Assignment

5.4 Context Navigation Pattern

The Content Strategy Agent — the most complex decision-maker with 10 tools — demonstrates how agents navigate the hierarchy to build precisely relevant context:

  1. Read active strategy directives (~500 tokens)
  2. Read combined domain summary for the big picture (~500 tokens)
  3. Ingest Competitive Intelligence report from workflow state (~2,000 tokens)
  4. Ingest Search Landscape report from workflow state (~2,000 tokens)
  5. Read per-pillar cluster summaries (~3,000 tokens)
  6. Check current calendar to avoid duplication (~1,000 tokens)
  7. Traverse graph for gaps, cannibalization, and outperformance (~1,500 tokens)
  8. Drill into specific page summaries for top candidates (~2,000 tokens)

Total: ~14,000 tokens of precisely relevant context, versus the impossibility of stuffing 847+ full pages into a context window.


6. Strategy Layer

6.1 Competitive Intelligence Agent

Continuously analyzes the competitor database — structured LangExtract data, not raw HTML — and produces actionable competitive insights. Runs weekly for full analysis and daily for a lightweight change digest.

Outputs: Competitor moves (with relevance scoring and source extraction IDs), content gaps (with estimated impact), positioning insights. Every finding includes provenance for traceability.

6.2 Search Landscape Agent

Monitors keyword performance, SERP composition, AI Overview appearances, and search trends using structured SERP data. Runs daily for ranking changes and AI Overview monitoring, weekly for full landscape analysis.

Outputs: Ranking changes with trend classification, AI Overview citation alerts, emerging keyword opportunities, declining content flags, SERP feature opportunities.

6.3 Content Strategy Agent

The brain of the system. Synthesizes both prior agent outputs with content inventory, strategy directives, and graph relationships to produce a prioritized, scheduled content plan.

Scoring model: strategic_alignment × search_opportunity × competitive_urgency × gap_severity. The agent checks the graph for cannibalization before recommending new content, respects human-added calendar items as fixed constraints, and suggests scheduling based on capacity.
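The multiplicative scoring model can be sketched as below. The spec names the four factors but not their ranges; treating each as a 0–1 value is an assumption for illustration.

```typescript
// Sketch of the multiplicative priority score. Factor ranges (0..1) are
// assumed; the spec defines only the factor names and the multiplication.
interface OpportunityScores {
  strategicAlignment: number;  // 0..1
  searchOpportunity: number;   // 0..1
  competitiveUrgency: number;  // 0..1
  gapSeverity: number;         // 0..1
}

function priorityScore(s: OpportunityScores): number {
  // Multiplicative, not additive: one weak factor suppresses the whole
  // opportunity, so high-volume keywords with zero strategic fit score 0.
  return s.strategicAlignment * s.searchOpportunity * s.competitiveUrgency * s.gapSeverity;
}
```

The multiplication is the point: an additive model would let three strong factors mask a disqualifying one.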

Outputs: Prioritized plan items with title, target keyword, content type, rationale, competitive context, schedule date, priority (1–3), estimated impact, internal link targets, and graph evidence. Plan items are written to the database with source: "ai_generated" and the workflow suspends for human calendar review.


7. Content Layer

7.1 Content Brief Agent

For each approved plan item, generates a detailed brief including: complete H2/H3 outline with keyword mapping per section, competitor differentiation strategy (using structured extraction data), internal linking targets from the graph, external resource recommendations, and brand voice requirements. The brief is saved and the workflow suspends for human approval (Gate 2).

7.2 Writer Agent

Receives the approved brief, brand voice extractions, and strategy context as dynamic instructions. Produces a complete first draft that follows the outline exactly, hits word count targets (±10%), integrates keywords naturally, includes all specified links, and marks image placement opportunities as [IMAGE: description] for human insertion.

Deliberate constraint: The Writer has only 4 tools (webSearch, queryExtractions, traverseContentGraph, and the brief context). Most of its context comes from the pre-assembled brief, not from live queries.

7.3 Editor Agent

Reviews the draft against seven dimensions: language correctness, verbal consistency (terminology, voice), brand voice adherence (compared against extracted patterns), factual grounding (claims verified against brief sources and web), structural quality, link integrity (all internal links verified against published pages), and keyword optimization.

Outputs: Overall pass/needs_revision assessment, specific edits with location, type, severity (critical/suggested), and fix suggestions. Voice consistency score (0–100), readability score, and optionally a revised draft.

7.4 Revision Loop

If the Editor returns "needs_revision" with critical edits, the draft returns to the Writer with the edit list as additional context. Maximum 2 revision cycles. After 2 cycles, the draft proceeds to human review regardless — humans catch what agents miss. This prevents infinite loops while maintaining quality.
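The bounded loop can be sketched as follows. The Writer and Editor calls are stubbed as plain functions; in the real system they are Mastra agent invocations.

```typescript
// Sketch of the bounded Writer <-> Editor revision loop (max 2 cycles).
interface EditorVerdict {
  assessment: "pass" | "needs_revision";
  criticalEdits: string[];
}

const MAX_REVISION_CYCLES = 2;

function runRevisionLoop(
  write: (feedback: string[]) => string,
  review: (draft: string) => EditorVerdict,
): { draft: string; cycles: number } {
  let draft = write([]);
  let verdict = review(draft);
  let cycles = 0;
  // After MAX_REVISION_CYCLES the draft proceeds to human review regardless,
  // which is what prevents an infinite Writer/Editor ping-pong.
  while (verdict.assessment === "needs_revision" && cycles < MAX_REVISION_CYCLES) {
    cycles += 1;
    draft = write(verdict.criticalEdits); // edit list fed back as context
    verdict = review(draft);
  }
  return { draft, cycles };
}
```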


8. Production Layer

8.1 Human Draft Review (Gate 3)

The most important step in the entire system. The workflow suspends and presents the draft alongside the brief in a side-by-side editor interface. The human reviewer:

  • Reads and assesses the full draft against the brief
  • Adds personal experience and original insights (the irreplaceable 20%)
  • Places and curates images at [IMAGE:] markers
  • Edits for voice and brand consistency
  • Fact-checks statistics and claims
  • Approves or rejects (rejection sends back to Content Layer with notes)

8.2 Final Cleanup Agent

A lightweight technical-only pass after human edits. Checks markdown formatting validity, image alt text and dimensions, internal link resolution, heading hierarchy integrity, and consistent list formatting. Does not change tone, wording, or content.

8.3 SEO Validation Engine

Deterministic code, not an LLM. Operates as a CI/CD deployment gate with three severity tiers:

| Tier | Behavior | Checks |
| --- | --- | --- |
| BLOCKING (10) | Publication halted. Auto-fix attempted (max 3 cycles). If still failing, escalate to human. | Meta title, meta description, heading hierarchy, keyword presence, keyword density, internal linking, schema validity, URL/slug, date integrity, structural compliance |
| WARNING (6) | Logged and tracked. Does not block. Contributes to content health score. | Image density, external linking, readability score, content depth coverage, FAQ quality, mobile/performance |
| INFO (6) | Logged for analytics and trending only. | Word count delta, keyword density exact, schema richness, link density ratio, reading time, citation-ready block count |
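Because the engine is deterministic code, the gate decision reduces to a pure function over check results. A minimal sketch (check names abbreviated; the real engine runs the full 10/6/6 suite):

```typescript
// Sketch of the three-tier deterministic publication gate.
type Tier = "blocking" | "warning" | "info";

interface CheckResult {
  name: string;
  tier: Tier;
  passed: boolean;
}

function gateDecision(results: CheckResult[]): "publish" | "halt" {
  // Only blocking failures halt publication; warnings feed the content
  // health score and info checks are logged for trend analysis.
  const blockingFailures = results.filter((r) => r.tier === "blocking" && !r.passed);
  return blockingFailures.length > 0 ? "halt" : "publish";
}
```

Determinism is the key property: the same draft always yields the same verdict, so auto-fix cycles and human escalations are reproducible.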

8.4 Publishing & Post-Publish Pipeline

The post-publish pipeline closes the data loop: extraction → summary regeneration (L2 → L1 → L0) → graph edge construction → bidirectional linking → monitoring. Content is not rolled back on pipeline failure — the content is live, and internal data syncs on the next scheduled job.


9. Content Type Registry

ContentEngine produces eight content types, each with a distinct structural template, schema mapping, internal linking profile, image density rule, and refresh cadence.

| Type | Length | Primary Schema | Key Requirements |
| --- | --- | --- | --- |
| Blog Post | 1,500–2,500 words | BlogPosting + FAQPage | Min 4 internal links, 3 images, FAQ required |
| Listicle | 1,500–3,000 words | Article + ItemList | 1 image per list item, numbered H2s required |
| Guide | 3,000–5,000 words | Article + FAQPage | Min 8 internal links, 5 images, citation-ready definition block |
| How-To | 2,000–4,000 words | HowTo + FAQPage | 1 image per step, troubleshooting section, HowToStep schema |
| Comparison | 2,000–4,000 words | Article + Product/ItemList | Head-to-head or roundup variants, comparison table required |
| Case Study | 1,500–2,500 words | Article + Review | Customer quote required, min 2 quantified results |
| Pillar Page | 2,500–4,000 words | CollectionPage | Min 15 internal links, updated when child content publishes |
| Glossary | 500–1,200 words | DefinedTerm + FAQPage | Under 50-word definition block, related term cross-links |

Each type carries a structural scaffold that the Writer Agent must follow exactly and the SEO Validation Engine checks at the deployment gate. Every page carries a @graph array of JSON-LD entities including Organization, WebSite, WebPage, the content-type-specific schema, author Person entity, and BreadcrumbList.
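The per-page @graph assembly can be sketched as below. The entity shapes are heavily abbreviated and the URLs/ids are hypothetical examples; real entities would carry full schema.org properties.

```typescript
// Minimal sketch of assembling a page-level JSON-LD @graph array.
function buildJsonLdGraph(page: {
  url: string;
  title: string;
  authorName: string;
  typeSchema: Record<string, unknown>; // content-type-specific entity
}): { "@context": string; "@graph": Array<Record<string, unknown>> } {
  return {
    "@context": "https://schema.org",
    "@graph": [
      { "@type": "Organization", "@id": `${page.url}#org` },
      { "@type": "WebSite", "@id": `${page.url}#website` },
      { "@type": "WebPage", "@id": page.url, name: page.title },
      page.typeSchema, // e.g. BlogPosting, HowTo, CollectionPage
      { "@type": "Person", name: page.authorName },
      { "@type": "BreadcrumbList", "@id": `${page.url}#breadcrumbs` },
    ],
  };
}
```

Emitting one @graph array (rather than separate script tags) lets the entities cross-reference each other by `@id`.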

9.1 Schema Architecture

9.2 Author Entities & E-E-A-T Strategy

Three author entities are defined, each with a dedicated profile page at /authors/[slug]/:

  • Alton Wells — Primary human author with full E-E-A-T credentials
  • Stan — Secondary human author
  • Auctor — AI editorial assistant, presented transparently. Profile describes the human-AI editorial process: articles are researched and drafted using AI, then reviewed, enriched with original insights, and approved by a human author.

Author profile pages include headshot/avatar, bio, expertise tags linked to pillar pages, social links, recent article feed, and ProfilePage + Person schema markup.


10. Refresh Queue System

Every published page is re-evaluated on a maximum 90-day cycle. Re-evaluation does not mean automatic refresh — it means the system scores refresh urgency and only pages crossing a threshold enter the active queue.

10.1 Urgency Score (0–100)

| Signal | Weight | Data Source | Scoring Logic |
| --- | --- | --- | --- |
| Position Decay | 30% | GSC performance | (position drop / 10) × 100, capped at 100 |
| Traffic Decline | 25% | GSC performance | (% impression decline, 30d vs prior 30d) × 100 |
| Content Age | 20% | our_pages.last_updated | (days since update / 90) × 100 |
| Competitive Displacement | 15% | Graph outperforms edges | New/worsened outperforms edges × 25 |
| Factual Staleness | 10% | Extraction claim entities | (dated claims older than current year / total) × 100 |
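The weighted combination in the table above can be sketched directly. Each signal is first normalized to 0–100 by its scoring logic, then combined by weight; only `positionDecaySignal` is shown as an example normalizer.

```typescript
// Sketch of the 0-100 refresh urgency score from five weighted signals.
interface UrgencySignals {
  positionDecay: number;            // 0..100
  trafficDecline: number;           // 0..100
  contentAge: number;               // 0..100
  competitiveDisplacement: number;  // 0..100
  factualStaleness: number;         // 0..100
}

function urgencyScore(s: UrgencySignals): number {
  return (
    0.30 * s.positionDecay +
    0.25 * s.trafficDecline +
    0.20 * s.contentAge +
    0.15 * s.competitiveDisplacement +
    0.10 * s.factualStaleness
  );
}

// Example normalizer from the table: (position drop / 10) * 100, capped at 100.
function positionDecaySignal(positionDrop: number): number {
  return Math.min(100, (positionDrop / 10) * 100);
}
```

A page at maximum on every signal scores 100; pages below the queue threshold are simply re-evaluated again next cycle.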

10.2 Refresh Tiers

Refreshes are mixed into the content calendar at a 70/30 ratio (new content / refreshes). Light refreshes consume 0.25 capacity units, moderate 0.5, and heavy 1.0. As the content library grows, the ratio naturally shifts. If the refresh queue is empty, 100% of capacity goes to new content.
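The capacity bookkeeping above can be sketched as a small allocator. The greedy fill order and helper names are assumptions; the tier costs and 70/30 split come from the spec.

```typescript
// Sketch of mixing refreshes into a planning cycle at the 70/30 ratio.
type RefreshTier = "light" | "moderate" | "heavy";

// Capacity units consumed per refresh tier (from the spec).
const TIER_COST: Record<RefreshTier, number> = {
  light: 0.25,
  moderate: 0.5,
  heavy: 1.0,
};

function planCycle(totalUnits: number, refreshQueue: RefreshTier[]) {
  // Empty queue: 100% of capacity goes to new content.
  const refreshBudget = refreshQueue.length === 0 ? 0 : totalUnits * 0.3;
  let spent = 0;
  const scheduled: RefreshTier[] = [];
  for (const tier of refreshQueue) {
    if (spent + TIER_COST[tier] > refreshBudget) break; // budget exhausted
    spent += TIER_COST[tier];
    scheduled.push(tier);
  }
  return { refreshes: scheduled, newContentUnits: totalUnits - spent };
}
```

Because light refreshes cost only 0.25 units, a 30% refresh budget can absorb many small updates before displacing a single new piece.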


11. Observability & Audit Trail

Every agent invocation produces exactly one audit log row capturing workflow context, timing, token usage, cost estimates, the ordered sequence of tool invocations, and output summaries.

11.1 What Gets Logged

  • Workflow ID, run ID, step ID, and agent ID for full traceability
  • Model ID used (resolved from config at invocation time)
  • Start time, completion time, and duration in milliseconds
  • Steps used vs. max steps configured
  • Input tokens, output tokens, total tokens, and estimated cost (USD)
  • Ordered tool call sequence with input summaries, output summaries, per-call duration, and token usage
  • Status: success, failed, failed_max_steps, or timeout
  • Error message and type if applicable

11.2 Dashboard Views

  • Cost by period: daily/weekly/monthly spend breakdown by agent
  • Agent efficiency: average steps used, max-steps failures, step utilization ratio
  • Workflow run detail: full step-by-step timeline for any specific run
  • Slowest agents: average and max duration for performance optimization

11.3 Error Handling

  • maxSteps policy: When an agent exhausts its step budget, the workflow captures the partial output, marks the step as failed_max_steps, logs the full tool call history, sends a Slack notification, and halts the workflow run. This is not a silent failure.
  • Malformed output: Zod validation catches invalid structured output. One retry with a corrective note. If retry fails, step fails.
  • Tool failures: Agents see error messages and can retry or use alternative approaches. The LangExtract sidecar uses a circuit breaker (3 consecutive failures opens the circuit, 60-second recovery window).
  • Post-publish pipeline: Each job retries 3× with exponential backoff (10s, 30s, 90s). Failed jobs do not roll back the publication — the content is live, but internal data is temporarily out of sync until the next scheduled job catches up.
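The post-publish retry policy can be sketched as follows. The sleep function is injectable so the backoff schedule is testable; in production it would be real timers (or Trigger.dev's own retry configuration).

```typescript
// Sketch of the post-publish job retry policy: 3 retries with
// exponential backoff of 10s, 30s, 90s before giving up.
const BACKOFF_MS = [10_000, 30_000, 90_000];

async function runWithRetries<T>(
  job: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  let lastError: unknown;
  // 1 initial attempt + one retry per backoff entry = 4 attempts total.
  for (let attempt = 0; attempt <= BACKOFF_MS.length; attempt++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      if (attempt < BACKOFF_MS.length) await sleep(BACKOFF_MS[attempt]);
      // A failed job never rolls back the publication; internal data
      // stays out of sync until the next scheduled job catches up.
    }
  }
  throw lastError;
}
```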

11.4 SEO Validation Tracking

Every validation run (pre-publish, refresh evaluation, scheduled weekly audit, manual) is stored with full check results. A weekly Trigger.dev job runs the complete validation suite against all published pages to catch degradation from external changes. Validation score decay feeds into the refresh urgency score.


12. Cost Model

Monthly estimates based on 50 published pieces, 10 competitors, and 100 tracked keywords:

| Category | Monthly Estimate | Notes |
| --- | --- | --- |
| Claude Sonnet 4 (all agents) | $200–400 | ~$4–8 per piece across strategy, writing, editing, briefs, cleanup |
| Gemini 2.5 Flash (extraction + summaries) | $50–100 | Extraction itself is ~$1.62/mo; bulk is summary generation |
| Firecrawl | $40–80 | Competitor sitemap crawling + page scraping |
| Semrush API | $119–229 | Business plan for keyword/SERP API access |
| Hosting (Vercel + Railway) | $50–100 | App + workers + LangExtract sidecar |
| PostgreSQL (managed) | $25–50 | Neon, Supabase, or Railway |
| GSC API / Image generation | $0 | Free API; images are human-placed |
| Total | $485–960 | |

13. Risk Matrix

| Risk | Prob. | Impact | Mitigation |
| --- | --- | --- | --- |
| LangExtract extraction quality inconsistent | Med | Med | Multi-pass extraction, high-quality few-shot examples (versioned in git, tested via regression suite), validation checks |
| Hierarchical summaries drift from source | Med | Med | Summaries regenerated daily from fresh extractions; timestamped and versioned |
| Graph relationship staleness | Med | Low | Weekly re-validation; confidence scores decay over time; stale edges flagged |
| Python–TypeScript bridge failure | Low | High | Health check endpoint, circuit breaker pattern, auto-restart, fallback to queued retry |
| Gemini model version drift | Med | Med | Pin model version, 10-document regression suite, 15% extraction change threshold blocks deployment |
| LLM output quality variance | High | Med | Multi-agent review pipeline + human gate + programmatic SEO checks |
| Google targeting AI content | Med | High | 80/20 human-AI method ensures genuine Experience + Expertise in every piece |
| Hallucination in published content | Med | High | Fact-check via extracted claims + human review + LangExtract source grounding |
| Content cannibalization at scale | Med | Med | Graph cannibalizes edges + Strategy Agent checks before planning |

14. Future: Filesystem-as-Context

The current hierarchical summary approach works well but has a ceiling: summaries are pre-generated snapshots. As the content library scales to thousands of pages, keeping summaries fresh becomes continuous compute cost, and pre-computing what context agents need is inherently wasteful.

The filesystem-as-context pattern (inspired by Andrej Karpathy's context engineering framework and Anthropic's Skills system) offers a superior approach at scale: instead of pre-loading context, structure all system knowledge as a navigable filesystem. Agents use ls, grep, glob, and file reading to pull exactly the context they need for the current task.

Context efficiency gain: An agent navigating the filesystem builds ~3,500 tokens of precisely relevant context for a planning task, compared to ~14,000 tokens with the hierarchical summary approach — because the agent decides what to load based on the actual task.

Implementation effort: 5–7 weeks on top of the base system: filesystem generation pipeline (2–3 weeks), sandbox environment per agent session (1–2 weeks), filesystem-aware agent prompts (1 week), and a hybrid SQL + filesystem approach for real-time data.

Recommended path: Build the base system using hierarchical summaries + graph first. Validate at current scale. Implement the filesystem layer when agent context quality becomes a bottleneck (likely at 500+ pages, 10+ competitors, 50+ pieces/month).


This is a living specification. Addendums #1–3 provide full implementation detail for each layer.