
Auctor: Full Audit & Path to Content Production

Context

Auctor is the content engine for Consul (https://consul.so). The system has 34+ DB tables, Firecrawl/DataForSEO integrations, 3 Mastra workflows, 40+ agent tools, and an MCP server — but we need to verify everything is wired up, fix the domain (currently hardcoded as consul.ai, should be consul.so), populate data, and test end-to-end before we can start producing content.

This plan audits current state, fixes issues, populates data, and tests the full research → plan → brief → draft → publish pipeline.

Phase 0: Fix Domain Reference (consul.ai → consul.so)

The owned domain is hardcoded as consul.ai in multiple places. Fix before anything else so all crawls, rankings, and publishing use the correct domain.

Files to update:

- frontend/src/content-engine/defaults.ts (lines 15, 52)
  - defaultSiteTarget.ownedDomain: 'consul.ai' → 'consul.so'
  - defaultStrategyDirective.directives.ownedDomain: 'consul.ai' → 'consul.so'
- frontend/drizzle/0001_auctor_integration_custom.sql (lines 330, 384)
  - Seed insert owned_domain: 'consul.ai' → 'consul.so'
  - URL template: 'https://consul.ai/' → 'https://consul.so/'
- frontend/src/mastra/agents/auctor-agent.ts — search for consul.ai in agent instructions
- frontend/lib/pipeline-configs.ts — any brand voice references

DB update — fix the existing rows:

```sql
UPDATE auctor.site_targets
SET owned_domain = 'consul.so'
WHERE site_key = 'consul';

UPDATE auctor.strategy_directives
SET directives = jsonb_set(directives, '{ownedDomain}', '"consul.so"')
WHERE id = 'strategy-consul-default';
```

Phase 1: Environment & Connectivity Verification

1.1 Pull latest env vars

```sh
cd /Users/altonwells/conductor/workspaces/auctor/chennai
npx dotenv-vault@latest pull
```

1.2 Check required env vars

Must have (the system won't function without these):

| Variable | Purpose |
| --- | --- |
| NEXT_PUBLIC_SUPABASE_URL | Supabase project URL |
| NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY | Public Supabase key (fallback: NEXT_PUBLIC_SUPABASE_ANON_KEY) |
| SUPABASE_SERVICE_ROLE_KEY | Full-privilege DB access |
| DATABASE_URL | Direct Postgres for Mastra workflow state |
| ANTHROPIC_API_KEY | Powers all agent execution (Claude) |
| LANGEXTRACT_API_KEY | Gemini key for LangExtract extraction pipelines |
| ENCRYPTION_KEY | AES-256 for storing integration secrets in DB |

Needed for content production:

| Variable | Enables |
| --- | --- |
| FIRECRAWL_API_KEY | Competitor page discovery & crawling |
| DATAFORSEO_LOGIN + DATAFORSEO_PASSWORD | Search volume, keyword difficulty, SERP snapshots |

Nice to have:

| Variable | Enables |
| --- | --- |
| OPENAI_API_KEY | GPT-4o model option for agents |
| GOOGLE_GENERATIVE_AI_API_KEY | Gemini agent model option |
| GSC_CLIENT_EMAIL + GSC_PRIVATE_KEY + GSC_SITE_URL + GA4_PROPERTY_ID | Google Search Console performance data |

1.3 Start dev server & run diagnostics

```sh
pnpm dev
```

Then run these three checks:

Readiness baseline (your scorecard — re-run after every phase):

```sh
curl -s http://localhost:3000/api/content-engine/readiness | python3 -m json.tool
```

Integration status:

```sh
curl -s http://localhost:3000/api/content-engine/integrations | python3 -m json.tool
```

Anthropic and Gemini must show isSet: true.

DB connectivity: Open http://localhost:3000/settings/db-check

- Service-role client reads auctor.site_targets — must succeed
- Publishable-key client reads consul.posts — must succeed
- PGRST106 errors mean the schema isn't exposed in Supabase Dashboard > API Settings

Phase 2: Database Schema & Data Audit

2.1 Verify all 14 migrations applied

Confirm these table groups exist in the auctor schema (via Supabase SQL editor or MCP):

- Content pipeline: site_targets, strategy_directives, content_plan_items, content_briefs, content_drafts, editor_reviews, seo_validation_runs
- Documents & knowledge: documents, document_extractions, summary_nodes, graph_edges
- Competitive intel: competitor_domains, page_snapshots, page_seo_metadata, page_diffs, domain_rankings
- Keywords & SEO: tracked_keywords, keyword_clusters, keyword_metrics, serp_snapshots, serp_feature_ownership, ai_visibility_data, gsc_performance
- Ops: agent_runtime_config, agent_audit_logs, sync_runs, activity_log, automations, agent_sessions, system_events, api_cost_log, extraction_configs, integration_secrets
- consul schema: posts, authors, categories, keywords, post_keywords

Missing tables → run the corresponding migration from frontend/drizzle/ (0000–0013).

2.2 Verify bootstrap seed data

```sql
SELECT site_key, name, owned_domain FROM auctor.site_targets;
-- Expected: 1 row, site_key='consul', owned_domain='consul.so' (after Phase 0 fix)

SELECT id, status FROM auctor.strategy_directives;
-- Expected: 1 row with status='active'

SELECT agent_id, model_id FROM auctor.agent_runtime_config;
-- Expected: 1 row for 'auctor' agent

SELECT COUNT(*) FROM auctor.extraction_configs;
-- Expected: 9 (5 pipeline + 4 content-type presets)
```

If missing, hit the readiness endpoint — bootstrap auto-runs on first request.

2.3 Snapshot existing data

```sql
SELECT 'competitor_domains' AS table_name, COUNT(*) AS rows FROM auctor.competitor_domains
UNION ALL SELECT 'documents', COUNT(*) FROM auctor.documents
UNION ALL SELECT 'document_extractions', COUNT(*) FROM auctor.document_extractions
UNION ALL SELECT 'tracked_keywords', COUNT(*) FROM auctor.tracked_keywords
UNION ALL SELECT 'keyword_clusters', COUNT(*) FROM auctor.keyword_clusters
UNION ALL SELECT 'keyword_metrics', COUNT(*) FROM auctor.keyword_metrics
UNION ALL SELECT 'domain_rankings', COUNT(*) FROM auctor.domain_rankings
UNION ALL SELECT 'summary_nodes', COUNT(*) FROM auctor.summary_nodes
UNION ALL SELECT 'graph_edges', COUNT(*) FROM auctor.graph_edges
UNION ALL SELECT 'content_plan_items', COUNT(*) FROM auctor.content_plan_items
UNION ALL SELECT 'content_briefs', COUNT(*) FROM auctor.content_briefs
UNION ALL SELECT 'content_drafts', COUNT(*) FROM auctor.content_drafts
UNION ALL SELECT 'page_snapshots', COUNT(*) FROM auctor.page_snapshots
UNION ALL SELECT 'serp_snapshots', COUNT(*) FROM auctor.serp_snapshots
UNION ALL SELECT 'activity_log', COUNT(*) FROM auctor.activity_log
UNION ALL SELECT 'consul.posts', COUNT(*) FROM consul.posts
UNION ALL SELECT 'consul.keywords', COUNT(*) FROM consul.keywords;
```

Record these numbers. Everything at 0 needs to be populated in Phase 4.

Phase 3: Integration Configuration

3.1 Open http://localhost:3000/settings/integrations

3.2 Required — configure if missing

- Anthropic — all agent workflows depend on this
- Google Gemini — LangExtract extraction pipelines

3.3 Needed for content production

- Firecrawl (https://www.firecrawl.dev/app/api-keys) — without this, no competitor crawling
- DataForSEO (https://app.dataforseo.com/api-access) — without this, no keyword volume/difficulty data

3.4 Verify agent config

http://localhost:3000/settings/agents — confirm the "auctor" agent has a model selected and reasonable settings (temp ~0.5, maxSteps ~25).

Phase 4: Data Population (Getting Intelligence Into Auctor)

This is the critical phase — the workflows can’t produce good content without competitive and keyword intelligence.

4.1 Sync existing Consul content

If consul.posts has published articles, sync them into Auctor as "our pages":

```sh
curl -X POST http://localhost:3000/api/content-engine/sync/inbound \
  -H 'Content-Type: application/json' \
  -d '{"siteKey": "consul", "source": "consul"}'
```

This creates documents with class our_page — the system needs these to understand what Consul has already published and avoid duplication.

4.2 Onboard competitor domains (requires Firecrawl)

Navigate to http://localhost:3000/competitors. Add at least 3 competitors that compete with Consul in the AI assistant / executive AI space.

The onboard pipeline runs 9 stages per competitor:

1. Register domain in competitor_domains
2. Discover all URLs via Firecrawl sitemap mapping
3. Crawl pages, extracting markdown + HTML → page_snapshots
4. Extract metadata (titles, H1s, schema markup) → page_seo_metadata
5. Detect changes from previous snapshots → page_diffs
6. Collect rankings (what keywords they rank for) → domain_rankings
7. Sync to documents → documents (class: competitor_page)
8. Derive keywords from their content → tracked_keywords
9. Score threat level → updates competitor_domains.threat_score and tier

API alternative:

```sh
curl -X POST http://localhost:3000/api/content-engine/competitors/onboard \
  -H 'Content-Type: application/json' \
  -d '{"domain": "competitor.com", "name": "Competitor Name"}'
```

After each onboard, check http://localhost:3000/competitors — you should see page counts, crawl timestamps, and threat scores.

4.3 Seed and enrich tracked keywords

Navigate to http://localhost:3000/keywords. The competitor onboarding (4.2) should have auto-derived keywords. Review and add any missing ones — aim for 10+ keywords covering Consul's target topics.

If DataForSEO is configured, trigger metric enrichment from the Keywords page to populate:

- Search volume and monthly trends
- CPC and competition scores
- Keyword difficulty
- Search intent classification (informational, commercial, navigational, transactional)

4.4 Run keyword clustering

After keywords have metrics, clustering groups them by SERP result overlap into keyword_clusters. Each cluster gets a pillar keyword, total volume, avg difficulty, and opportunity score. This drives the strategy workflow's content calendar decisions.
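To make the grouping concrete, here is a minimal sketch of SERP-overlap clustering, assuming keywords whose top results share at least 3 URLs belong together. The real logic lives in frontend/src/content-engine/services/keyword-clustering.ts; the greedy strategy and the threshold here are assumptions:

```typescript
// keyword -> its top-10 SERP result URLs
type SerpMap = Record<string, string[]>;

// Count shared URLs between two result lists.
function overlap(a: string[], b: string[]): number {
  const setB = new Set(b);
  return a.filter((url) => setB.has(url)).length;
}

// Greedy clustering: a keyword joins the first cluster containing a member
// whose top results share >= minShared URLs with its own; otherwise it starts
// a new cluster.
function clusterBySerpOverlap(serps: SerpMap, minShared = 3): string[][] {
  const clusters: string[][] = [];
  for (const keyword of Object.keys(serps)) {
    const home = clusters.find((cluster) =>
      cluster.some((member) => overlap(serps[member], serps[keyword]) >= minShared)
    );
    if (home) home.push(keyword);
    else clusters.push([keyword]);
  }
  return clusters;
}
```

The intuition: if Google returns mostly the same pages for two keywords, one article can target both, so they belong in one cluster under a shared pillar keyword.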

4.5 Verify knowledge graph

After documents are synced and extraction pipelines have run, check that summary_nodes and graph_edges are being populated. These power the agent's ability to synthesize competitive intelligence.

4.6 Re-check readiness

```sh
curl -s http://localhost:3000/api/content-engine/readiness | python3 -m json.tool
```

Targets for content production readiness:

| Check | Target | What populates it |
| --- | --- | --- |
| Active Competitors | >= 3 | Step 4.2 |
| Competitor Pages | >= 20 | Step 4.2 (Firecrawl) |
| Our Pages | >= 5 | Step 4.1 (consul.posts sync) |
| SERP Snapshots | >= 5 | Step 4.3 (DataForSEO) |
| Extractions | >= 10 | Auto from document sync + LangExtract |
| Tracked Keywords | >= 10 | Steps 4.2 + 4.3 |
| Summary Nodes | >= 5 | Auto from extraction pipeline |
| Graph Edges | >= 5 | Auto from extraction pipeline |
| Strategy Directive | 1 | Bootstrap seed |
| Required Integrations | all green | Phase 3 |

All checks should be pass or warn before proceeding to workflow testing.
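A sketch of how the pass/warn/fail semantics can be read: a check passes at target, warns with partial data, and fails at zero. The field names and the warn rule below are assumptions — the readiness endpoint computes its own statuses server-side:

```typescript
type Status = "pass" | "warn" | "fail";

function scoreCheck(count: number, target: number): Status {
  if (count >= target) return "pass";
  if (count > 0) return "warn"; // partial data: usable, but keep populating
  return "fail";                // nothing populated yet
}

// Hypothetical field names for a few of the counts in the table above.
const targets: Record<string, number> = {
  activeCompetitors: 3,
  competitorPages: 20,
  ourPages: 5,
  trackedKeywords: 10,
};

function readyForProduction(counts: Record<string, number>): boolean {
  // "pass or warn" is good enough to proceed; any hard fail blocks workflow testing.
  return Object.keys(targets).every(
    (key) => scoreCheck(counts[key] ?? 0, targets[key]) !== "fail"
  );
}
```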

Phase 5: Test the Content Production Pipeline

This is the end-to-end test of Research → Plan → Brief → Draft → Publish.

5.1 Run Strategy Planning workflow

```sh
curl -X POST http://localhost:3000/api/content-engine/workflows/strategy \
  -H 'Content-Type: application/json' \
  -d '{"siteKey": "consul"}'
```

The workflow:

- Analyzes all competitive intelligence (competitor pages, rankings, threat scores)
- Maps the search landscape (keyword clusters, SERP features, gaps)
- Generates a prioritized content calendar (plan items with target keywords, content types, rationale)
- Suspends for your review at the calendar gate

Check http://localhost:3000/strategy/calendar — review the proposed plan items.

5.2 Approve the content calendar

```sh
curl -X POST http://localhost:3000/api/content-engine/workflows/resume \
  -H 'Content-Type: application/json' \
  -d '{"runId": "", "stepId": "calendarReview", "payload": {"approved": true}}'
```

Plan items are now persisted to content_plan_items with status approved.
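Both approval gates (calendarReview here, briefApproval in 5.4) follow the same suspend/resume pattern. A generic sketch of the gate semantics — the state shape is illustrative, and the real implementation is Mastra's workflow engine:

```typescript
type GateState = {
  runId: string;
  stepId: string;
  status: "suspended" | "resumed" | "rejected";
};

// Resume only makes sense while the run is parked at the gate; the payload's
// approved flag decides whether the workflow continues or stops.
function resumeGate(gate: GateState, payload: { approved: boolean }): GateState {
  if (gate.status !== "suspended") throw new Error("run is not waiting at a gate");
  return { ...gate, status: payload.approved ? "resumed" : "rejected" };
}
```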

5.3 Generate a content brief

Pick a plan item and run brief generation:

```sh
curl -X POST http://localhost:3000/api/content-engine/workflows/brief \
  -H 'Content-Type: application/json' \
  -d '{"planItemId": "", "siteKey": "consul"}'
```

The agent produces a brief with:

- Angle and differentiation from competitors
- Detailed outline (sections, subsections)
- Primary + supporting keyword mapping
- Internal linking opportunities
- Voice and tone requirements (per strategy directive)

Suspends for brief approval. Review at http://localhost:3000/briefs/.

5.4 Approve the brief

```sh
curl -X POST http://localhost:3000/api/content-engine/workflows/resume \
  -H 'Content-Type: application/json' \
  -d '{"runId": "", "stepId": "briefApproval", "payload": {"approved": true}}'
```

5.5 Produce a draft

```sh
curl -X POST http://localhost:3000/api/content-engine/workflows/draft \
  -H 'Content-Type: application/json' \
  -d '{"briefId": "", "siteKey": "consul"}'
```

The workflow runs three steps:

1. Writer — generates full markdown from the brief
2. Editor — reviews for voice consistency, readability, factual accuracy
3. Cleanup — technical formatting, SEO metadata, FAQ blocks, schema markup

Suspends for draft review. Check http://localhost:3000/drafts/.

5.6 Validate before publishing (dry run)

```sh
curl -X POST http://localhost:3000/api/content-engine/mcp-tool \
  -H 'Content-Type: application/json' \
  -d '{"tool": "auctor_publish", "input": {"draftId": "", "dryRun": true}}'
```

Returns a validation report: SEO score, blocking issues, warnings, internal link status.
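The dry-run report can gate the real publish mechanically: require zero blocking issues and an acceptable SEO score before flipping dryRun off. The report shape and threshold below are assumptions, not the actual API contract:

```typescript
interface ValidationReport {
  seoScore: number;
  blockingIssues: string[];
  warnings: string[]; // warnings alone should not block a publish
}

// Hypothetical gate: publish only when nothing blocks and SEO clears a floor.
function canPublish(report: ValidationReport, minSeoScore = 70): boolean {
  return report.blockingIssues.length === 0 && report.seoScore >= minSeoScore;
}
```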

5.7 Verify the activity log

http://localhost:3000/activity — every workflow start, agent run, suspension, approval, and config change should appear here as a complete audit trail.

5.8 Test MCP tools (for Claude Code integration)

From a Claude Code session in this project:

- auctor_context with resource='pipeline' — pipeline status overview
- auctor_intel with a keyword — deep-dive keyword intelligence
- auctor_list_content with stage='plan' — list plan items
- auctor_list_content with stage='draft' — list drafts

What This Gets You

After completing all phases, the system is ready for continuous content production:

- Competitive intelligence is live — 3+ competitors crawled, ranked, and threat-scored
- Keyword landscape is mapped — 10+ keywords with volume, difficulty, intent, clusters
- Strategy workflow produces a calendar — AI-generated, human-approved content plan
- Brief workflow creates detailed outlines — competitive differentiation built in
- Draft workflow writes, edits, and validates — full markdown with SEO metadata
- Publishing pipeline validates and ships — dry-run, then publish to consul.so
- Activity log tracks everything — complete audit trail of every action
- MCP tools let Claude Code drive it — research/plan/produce/publish from the terminal

Key Files Reference

| Purpose | File |
| --- | --- |
| Readiness endpoint | frontend/app/api/content-engine/readiness/route.ts |
| DB clients | frontend/src/content-engine/db/supabase.ts |
| Repository layer | frontend/src/content-engine/db/repositories.ts |
| Ingestion repos | frontend/src/content-engine/db/ingestion-repositories.ts |
| Schema types | frontend/src/content-engine/db/schema.ts |
| Bootstrap seed | frontend/src/content-engine/db/bootstrap.ts |
| Site defaults (domain fix) | frontend/src/content-engine/defaults.ts |
| Integration defs | frontend/src/content-engine/integrations.ts |
| Env loading | frontend/src/content-engine/env.ts |
| Firecrawl adapter | frontend/src/content-engine/services/firecrawl.ts |
| Competitor crawl | frontend/src/content-engine/services/competitor-crawl.ts |
| DataForSEO adapter | frontend/src/content-engine/adapters/dataforseo.ts |
| Keyword derivation | frontend/src/content-engine/services/keyword-derivation.ts |
| Keyword clustering | frontend/src/content-engine/services/keyword-clustering.ts |
| Workflows (strategy/brief/draft) | frontend/src/mastra/workflows/content-engine-workflows.ts |
| Agent tools (40+) | frontend/src/mastra/tools/content-engine-tools.ts |
| MCP server | scripts/mcp-server.mts |
| Activity log | frontend/src/content-engine/db/activity-log.ts |
| Agent runner | frontend/src/content-engine/services/agent-runner.ts |
| Migration 0001 (domain fix) | frontend/drizzle/0001_auctor_integration_custom.sql |