Content-as-Code: A Git-Native Filesystem Architecture for Auctor

The Core Thesis

Claude Code is the most sophisticated file-manipulation agent ever built. It reads, writes, edits, searches, diffs, and versions files with precision that exceeds most human developers. Yet when applied to content creation inside Auctor, it is forced to operate through database abstractions that strip away every native advantage. Content lives in PostgreSQL rows. The agent's most powerful tools sit idle while every operation flows through MCP wrappers, HTTP round-trips, and SQL queries.

This document argues that content creation is structurally isomorphic to software development, and proposes an architecture where content lives on the filesystem under git version control — with PostgreSQL serving as a materialized view for the web UI rather than the source of truth.

The result: every git primitive becomes a content operation, and every capability that makes Claude Code exceptional at writing software transfers directly to creating, editing, reviewing, and publishing content.


1. The Problem: Database Rows Are the Wrong Abstraction

1.1 Current State

Auctor is a multi-agent content operations platform with substantial infrastructure already built:

  • Desktop app (~90% complete): Full bootstrap sequence, SQLite store, Claude Code SDK integration with session management, MCP server bridge (14 tools), tool policy, scheduler with 4 cron cycles, file watcher, Next.js + Python sidecar management, IPC handlers (13/15 implemented).
  • Frontend (~95% complete): Harness with 5 subagents (Explorer, Researcher, Writer, Editor, Validator) across 3 modes, 3 Mastra workflows with human suspension gates, 9 content engine agents, 28 tools, full CRUD API routes, agent window UI with dual transport, publishing pipeline.
  • Data layer (substantially complete): Supabase dual-client, 30+ tables across 4 schemas, live sync for Consul posts, Firecrawl competitor crawling, DataForSEO keyword/SERP research, LangExtract document extraction, Google Analytics/GSC API.

The content lifecycle flows through well-defined stages, each stored as a database row:

```
ContentPlanItem → ContentBrief → BlogPostDraft → EditorReview → SeoValidation → consul.posts
```

A ContentBrief is a JSON blob with fields like angle, outline, keywordMapping. A BlogPostDraft is a row containing markdownBody (the entire article as a single text field), metaTitle, metaDescription, faqBlocks, internalLinks, schemaGraph.

1.2 The Impedance Mismatch

When a Claude Code session runs in the workspace, these native tools are available:

| Tool | Capability | Current Usage for Content |
|---|---|---|
| Read | Read any file with line numbers | Reads CLAUDE.md and memory files only |
| Edit | Surgical string replacement | Edits memory files only |
| Write | Create or overwrite files | Writes memory files only |
| Glob | Pattern-match filenames | Unused for content |
| Grep | Regex search across files | Unused for content |
| git diff | Line-by-line change comparison | Auto-commit only |
| git log | Full revision history | Unused |
| git branch | Parallel workstreams | Unused |
| Bash | Run arbitrary scripts | Explicitly disallowed |

Instead, every content operation routes through: MCP server → HTTP POST → Next.js API → repository function → SQL → PostgreSQL. The agent calls mcp__auctor__loadApprovedBrief({ briefId }) and receives a JSON blob. It calls mcp__auctor__formatForCms({ title, slug, markdownBody, metadata }) to push content back.

1.3 What Is Lost

No surgical editing. The markdownBody field is a single text column. To change paragraph 3 of a 2,000-word article, the agent must rewrite the entire field. The Edit tool cannot be used because the content isn't in a file.

No cross-content search. No MCP tool searches across all published articles. If the agent needs "every article mentioning competitor X" or "all places linking to a specific URL," it has no way to find them. Grep would do this trivially with files.

No change tracking. EditorReview stores edits as a JSON array of { location, type, severity, suggestion } objects. There is no way to see what actually changed between revision 1 and revision 2. git diff would provide this for free.

No revision history. The database stores current state only. When a draft is revised, the previous version is overwritten. git log would preserve every intermediate state.

No parallel experimentation. No way to try two angles for the same article and compare them. Git branches exist for exactly this.

No content-aware validation. Bash is fully disallowed. The agent cannot run scripts against content. Every validation check requires a custom MCP tool implementation.

No template compliance. No way to diff a draft against its expected structure template. git diff between a template and content would show exactly where the draft diverges.


2. The Code-Content Isomorphism

2.1 The Structural Parallel

This is not a superficial analogy. It is a structural isomorphism that holds at every level:

| Software Development | Content Creation | Shared Abstraction |
|---|---|---|
| Read existing code, docs, issues | Read competitor content, SERP data, brand voice | Research: gather context from existing artifacts |
| Architecture design, create specs | Content strategy, create briefs | Planning: define structure before building |
| Write code in files | Write content in documents | Authoring: produce artifacts in structured formats |
| Code review (PR review) | Editorial review (voice, accuracy, structure) | Review: peer evaluation against quality standards |
| Tests + linting | SEO validation + readability scoring | Validation: automated quality gates |
| CI/CD pipeline | CMS formatting + publishing | Deployment: transform and ship to production |
| Monitoring + alerting | Performance tracking + ranking changes | Observation: post-deployment feedback loops |
| Bug fix from production feedback | Content revision from analytics | Iteration: close the loop |

2.2 The Agent's Native Language

What Claude Code does for software:

  1. Read the spec/issue → Read tool
  2. Explore the codebase for context → Glob + Grep + Read
  3. Write or edit implementation files → Write + Edit
  4. Run tests to validate → Bash
  5. Review changes with git diff → Bash (git diff)
  6. Commit with a meaningful message → Bash (git commit)
  7. Create a PR for human review → Bash (gh pr create)

What the agent does for content today:

  1. Load the brief → mcp__auctor__loadApprovedBrief
  2. Query extractions for context → mcp__auctor__queryExtractions
  3. Generate entire draft as JSON blob → mcp__auctor__formatForCms
  4. Run SEO validation → (only via workflow step)
  5. Review changes → (not possible — no diff)
  6. Save to database → mcp__auctor__uploadToCms
  7. Human reviews in web UI → (separate system)

The first workflow is natural, composable, and leverages the agent's full capability. The second is constrained, monolithic, and forces every operation through a narrow API.

2.3 Git Primitives as Content Operations

| Git Primitive | Content Operation |
|---|---|
| Repository | Content project (one site) |
| main branch | Canonical published state |
| content/<slug> branch | In-progress content piece |
| experiment/<slug>/<variant> branch | A/B angle testing |
| project/<name> branch | Multi-piece coordinated project (pillar + cluster) |
| Commit | Content revision with structured message |
| Merge to main | Review gate (brief approval, draft approval) |
| pre-commit hook | Validation pipeline (SEO, readability, links) |
| post-commit hook | DB sync trigger |
| git diff | Track changes between revisions |
| git log | Full revision history / audit trail |
| git blame | Content provenance (which agent/operator wrote each line) |
| git bisect | Find which revision caused an SEO regression |
| git stash | Pause mid-draft for higher-priority interrupt |
| git cherry-pick | Pull a great paragraph from an experimental branch |

3. Architecture

3.1 Design Principles

  1. The filesystem is the source of truth for content. PostgreSQL is a materialized view for the web UI.
  2. Every content artifact is a file. Briefs are Markdown. Drafts are Markdown. Research notes are Markdown. Metadata is YAML. Reports are JSON.
  3. Git provides the content lifecycle. Branches are workstreams. Commits are revisions. Merges are review gates. Hooks are validation pipelines.
  4. The agent uses its native tools. No MCP wrappers for content operations. Read, Edit, Grep, Glob, git — the same tools it uses for code.
  5. The sync engine is invisible. Content appears in the web UI automatically. Changes from the web UI appear in files automatically.

3.2 Workspace Directory Structure

```
~/.auctor/workspace/                  # Git repository root
├── .git/                             # Git version control
│
├── CLAUDE.md                         # Agent identity & instructions (exists today)
├── memory/                           # Observational memory (exists today)
│   ├── observations.md
│   ├── decisions.md
│   └── learnings.md
│
├── .auctor/                          # System metadata & tooling
│   ├── manifest.json                 # Content index: slug → {IDs, branch, sync state}
│   ├── hooks/                        # Git hooks (symlinked to .git/hooks/)
│   │   ├── pre-commit                # Validate staged content files
│   │   ├── post-commit               # Trigger filesystem → DB sync
│   │   └── prepare-commit-msg        # Auto-format commit messages
│   ├── scripts/                      # Validation & utility scripts
│   │   ├── validate-seo.sh           # SEO check runner
│   │   ├── check-links.sh            # Internal link verifier
│   │   ├── readability.sh            # Readability score calculator
│   │   └── structure-check.sh        # Template compliance checker
│   ├── templates/                    # Content structure templates
│   │   ├── blog-post/
│   │   │   ├── brief.md
│   │   │   └── draft.md
│   │   ├── comparison/
│   │   │   └── draft.md
│   │   ├── pillar/
│   │   │   └── draft.md
│   │   └── how-to/
│   │       └── draft.md
│   └── sync/                         # Sync bookkeeping
│       ├── last-sync.json            # Per-piece last sync timestamps
│       └── conflicts.json            # Unresolved sync conflicts
│
├── strategy/                         # Strategy layer
│   ├── active.yaml                   # Current strategy directive
│   └── archive/                      # Historical strategies
│
├── plan/                             # Content plan items
│   └── <slug>/
│       └── plan.yaml                 # Plan item metadata
│
├── content/                          # Active content pieces (the core)
│   └── <slug>/
│       ├── brief.md                  # Brief with YAML frontmatter
│       ├── research/                 # Research artifacts
│       │   ├── competitor-analysis.md
│       │   ├── serp-analysis.md
│       │   ├── keyword-data.json
│       │   └── sources.md
│       ├── draft.md                  # Content body (Markdown)
│       ├── meta.yaml                 # SEO & publish metadata
│       ├── reviews/                  # Editor reviews
│       │   ├── review-001.md
│       │   └── review-002.md
│       ├── validation/
│       │   └── seo-report.json
│       └── assets/
│
├── library/                          # Published content (read reference)
│   └── <slug>/
│       ├── article.md
│       └── meta.yaml
│
└── competitors/                      # Competitive intelligence
    └── <domain>/
        ├── profile.yaml
        └── pages/
            └── <path-slug>.md
```

3.3 The Content Piece as a Directory

A single content piece is a directory containing multiple files — analogous to a software module. The "source code" (draft.md), the "spec" (brief.md), the "test results" (validation/seo-report.json), the "code review" (reviews/review-001.md), and the "documentation" (research/).

Brief format — Markdown with YAML frontmatter:

```markdown
---
id: b7e4a2f0-1234-5678-9abc-def012345678
planItemId: a3f2c1d0-1234-5678-9abc-def012345678
status: approved
primaryKeyword: "ai agents for content creation"
contentType: blog_post
publishProfile:
  siteKey: consul
  publishKind: article
  schemaType: Article
  categorySlug: ai-assistant
  cta: create-assistant
keywordMapping:
  primary: "ai agents for content creation"
  secondary:
    - "ai content tools"
    - "automated content pipeline"
competitorDifferentiation:
  - "Unlike generic AI writing tools, focus on the operator's decision-making role"
voiceRequirements:
  - "Confident operator voice — CEOs and founders, not marketers"
  - "Concrete examples over abstract claims"
internalLinks:
  - /blog/what-is-an-ai-assistant
  - /blog/content-strategy-for-startups
---

# Content Brief: AI Agents for Content Creation

## Angle

The operator's perspective on why AI agents — not AI writing tools — represent the
actual shift in content operations...

## Outline

### H2: The Content Operations Problem Nobody Talks About
- Content teams don't have a writing problem — they have an operations problem
- Keyword: "content operations ai"

### H2: What Makes an Agent Different from an AI Writer
...
```

Draft format — Markdown with minimal frontmatter, plus a separate metadata file:

draft.md:

```markdown
---
id: c9d1b3e0-1234-5678-9abc-def012345678
briefId: b7e4a2f0-1234-5678-9abc-def012345678
status: pending_human_review
title: "AI Agents for Content Creation: The Operator's Guide"
slug: ai-agents-for-content-creation
---

# AI Agents for Content Creation: The Operator's Guide

Most conversations about AI and content start in the wrong place...
```

meta.yaml:

```yaml
metaTitle: "AI Agents for Content Creation | Consul"
metaDescription: "How AI agents transform content operations."
excerpt: "The operator's guide to building an agentic content pipeline."
primaryKeyword: "ai agents for content creation"
faqBlocks:
  - question: "Can AI agents replace content teams?"
    answer: "No. Agents handle operations..."
internalLinks:
  - href: /blog/what-is-an-ai-assistant
    anchorText: "AI assistant"
imageMarkers:
  - marker: "[IMAGE: Five-agent architecture diagram]"
    description: "Flow diagram showing the pipeline"
schemaGraph:
  - "@type": Article
    headline: "AI Agents for Content Creation: The Operator's Guide"
publishedUrl: null
publishedRecordId: null
```

This separation is the key advantage over the database model: each file can be edited independently. The agent can Edit the draft body without touching metadata. It can update SEO fields in meta.yaml without rewriting content. The Edit tool's surgical precision is fully leveraged.
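As a sketch of what this separation implies for tooling: the serializer only needs to split the frontmatter block from the body before handing either part to a YAML or Markdown parser. The helper below is a hypothetical minimal version (the real Phase 2 serializer would use js-yaml to parse the extracted frontmatter text):

```typescript
// Hypothetical helper: split a brief.md or draft.md into its raw YAML
// frontmatter text and its Markdown body, so each can be handled separately.
function splitFrontmatter(source: string): { frontmatter: string; body: string } {
  const match = source.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { frontmatter: "", body: source };
  return {
    frontmatter: match[1],               // raw YAML text between the fences
    body: source.slice(match[0].length), // Markdown body after the closing fence
  };
}

const draft = `---
id: c9d1b3e0
status: pending_human_review
---

# AI Agents for Content Creation
`;
const { frontmatter, body } = splitFrontmatter(draft);
```

With the split done, a status change touches only the frontmatter string and a revision touches only the body, mirroring how the agent's Edit tool operates on each file.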

3.4 Commit Message Convention

Structured messages create a machine-parseable, human-readable history:

```
research(ai-agents): competitive analysis — 3 competitors, 12 gaps identified
brief(ai-agents): initial brief — comparison angle, 4 H2s, 3 FAQ questions
draft(ai-agents): first draft v1 — 2,847 words, 4 internal links
review(ai-agents): editor cycle 1 — voice score 78/100, 3 critical edits
draft(ai-agents): revision v2 — addressed critical edits, improved intro
validate(ai-agents): SEO score 8/10 — meta description 2 chars over limit
approve(ai-agents): human approved — operator added personal insight
publish(ai-agents): pushed to consul.posts
```
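Because the convention is regular, history stays machine-parseable. A hypothetical parser, assuming the eight commit types shown above:

```typescript
// Sketch: parse a type(slug): description commit message into structured
// fields. The set of allowed types is an assumption based on the examples.
type ContentCommit = { type: string; slug: string; description: string };

function parseContentCommit(message: string): ContentCommit | null {
  const match = message.match(
    /^(research|brief|draft|review|validate|approve|publish)\(([a-z0-9-]+)\): (.+)$/,
  );
  if (!match) return null;
  return { type: match[1], slug: match[2], description: match[3] };
}

const commit = parseContentCommit(
  "draft(ai-agents): first draft v1 — 2,847 words, 4 internal links",
);
```

A reporting cycle could map this over `git log --format=%s` output to count revisions per piece without any database query.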

3.5 Branch Strategy

  • main — Canonical published state. The web UI reads from here via sync. Merging to main is the content equivalent of deploying to production.
  • content/<slug> — In-progress content piece. Created when a plan item is approved. All work happens here. Merged to main when published.
  • experiment/<slug>/<variant> — Experimental alternative. Fork from a content branch, try a different angle, compare with git diff, merge the winner.
  • project/<name> — Coordinated multi-piece effort (pillar + cluster). All pieces developed together, cross-linked, published as a unit.
  • hotfix/<slug> — Quick corrections to published content (typo, broken link, outdated stat).

4. The Agent's New Capabilities

4.1 Surgical Content Editing

Before: The Writer generates an entire markdownBody field. To revise, it regenerates everything — even if one paragraph needs to change.

After:

```
Edit("content/ai-agents/draft.md",
  old: "Most conversations about AI and content start in the wrong place. They focus on...",
  new: "Most conversations about AI and content start with the wrong question: can AI write?
        The better question is: can AI operate?..."
)
```

One paragraph changed. The rest of the 2,847-word article untouched. The Editor agent can directly Edit the draft — making the exact change it recommends, in place, with git diff showing precisely what it did.

4.2 Cross-Content Intelligence

Before: No tool searches across all content.

After:

```
# Find every published article mentioning the competitor
Grep("consul.ai", "library/")

# Find all drafts with FAQ sections
Grep("^### FAQ", "content/*/draft.md")

# Find internal link opportunities
Grep("content operations", "library/*/article.md")

# Detect keyword cannibalization
Grep("ai agents for content", "library/*/meta.yaml")

# Find all TODO markers across in-progress content
Grep("\\[IMAGE:|\\[TODO:", "content/*/draft.md")
```

Internal linking — one of the highest-impact SEO activities — becomes a trivial Grep instead of an impossible operation.
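The cannibalization check reduces to grouping pieces by primary keyword once Grep has pulled the values out of each meta.yaml. A sketch with assumed shapes:

```typescript
// Sketch: flag keywords claimed as primary by more than one library piece.
// The PieceMeta shape is an assumption matching meta.yaml's primaryKeyword.
type PieceMeta = { slug: string; primaryKeyword: string };

function findCannibalization(pieces: PieceMeta[]): Map<string, string[]> {
  const byKeyword = new Map<string, string[]>();
  for (const { slug, primaryKeyword } of pieces) {
    const slugs = byKeyword.get(primaryKeyword) ?? [];
    slugs.push(slug);
    byKeyword.set(primaryKeyword, slugs);
  }
  // Keep only keywords targeted by more than one piece
  return new Map([...byKeyword].filter(([, slugs]) => slugs.length > 1));
}

const duplicated = findCannibalization([
  { slug: "ai-agents", primaryKeyword: "ai agents for content" },
  { slug: "ai-content-tools", primaryKeyword: "ai agents for content" },
  { slug: "content-strategy", primaryKeyword: "content strategy" },
]);
```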

4.3 Full Revision History

Before: Current state only. Previous versions overwritten.

After:

```
git log --oneline content/ai-agents/
→ a3f2c1d publish(ai-agents): pushed to consul.posts
→ b7e4a2f validate(ai-agents): SEO score 8/10
→ c9d1b3e draft(ai-agents): revision v2
→ d2f5c4a review(ai-agents): cycle 1 — voice score 78/100
→ e8a3d5b draft(ai-agents): first draft v1

git diff e8a3d5b..c9d1b3e -- content/ai-agents/draft.md
→ Exact line-by-line changes between first draft and revision v2
```

4.4 Branch-Based Experimentation

The operator says: "Try a more provocative take."

```
git checkout -b experiment/ai-agents/provocative content/ai-agents
# Edit the brief and draft with a different angle
git diff content/ai-agents..experiment/ai-agents/provocative -- content/ai-agents/draft.md
# Compare the two approaches, merge the winner
```

Impossible in the database model without duplicating rows, building custom diffing UI, and adding a "variant" concept to the schema.

4.5 Multi-Piece Content Projects

A pillar page strategy with 5 supporting articles — all developed on a single project/ branch. The agent Greps across all pieces to ensure consistent terminology, no keyword cannibalization, proper cross-linking. The project ships as a single merge to main.

4.6 Template Compliance

```
git diff --no-index .auctor/templates/blog-post/draft.md content/ai-agents/draft.md
```

Shows exactly where the draft diverges from expected structure — missing sections, extra sections, wrong order. The template becomes a contract; the diff is the compliance check.
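The same contract can also be checked structurally rather than textually. A hypothetical helper that compares H2 section presence between a template and a draft, one plausible basis for structure-check.sh:

```typescript
// Sketch: extract H2 headings from Markdown and report template sections
// missing from a draft. A real checker might also verify order and extras.
function extractH2s(markdown: string): string[] {
  return markdown
    .split("\n")
    .filter((line) => line.startsWith("## "))
    .map((line) => line.slice(3).trim());
}

function missingSections(template: string, draft: string): string[] {
  const present = new Set(extractH2s(draft));
  return extractH2s(template).filter((heading) => !present.has(heading));
}

const template = "## Angle\n\n## Outline\n";
const draftDoc = "## Angle\n\nSome body text.\n";
const missing = missingSections(template, draftDoc);
```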

4.7 Emergent Properties

These arise naturally from git without any custom engineering:

  • Time travel: git checkout HEAD~5 -- content/ai-agents/draft.md
  • Blame: git blame content/ai-agents/draft.md — which session wrote each line
  • Bisect: Binary search through history to find which commit caused an SEO regression
  • Cherry-pick: Pull a great paragraph from an experimental branch
  • Stash: Pause mid-draft for a higher-priority interrupt
  • Hooks as middleware: Add any validation, notification, or transformation without touching agent code
  • Submodules: Shared glossary or brand voice across multiple sites

5. Sync Engine: Bridging Two Worlds

5.1 The Core Challenge

Content must exist in two places: on the filesystem for the agent's native tools, and in PostgreSQL for the web UI (calendar, content library, Kanban, dashboards, review interfaces).

This is solvable because of a crucial asymmetry: the agent and the operator touch different aspects of content at different times. The agent produces content body, structure, metadata, and research. The operator makes approval decisions, sets schedule dates, assigns priorities, and gives inline feedback. These responsibilities rarely overlap.

5.2 Source of Truth Split

| Concern | Authoritative Source |
|---|---|
| Content artifacts (briefs, drafts, research, reviews, validation) | Filesystem |
| Operational state (approvals, schedule dates, workflow runs, cost accounting) | Database |

The agent produces and refines content → lives in files. The operator makes decisions → lives in the database. The database reflects current content state → synced from files. The files reflect current decisions → synced from database.

5.3 Outbound Sync (Filesystem → Database)

Triggered by the post-commit git hook:

```
post-commit hook fires
  ├── git diff HEAD~1 --name-only → list changed files
  │
  ├── For each changed content/<slug>/brief.md:
  │     ├── Parse YAML frontmatter + Markdown body
  │     └── Upsert auctor.content_briefs
  │
  ├── For each changed content/<slug>/draft.md + meta.yaml:
  │     ├── Parse frontmatter, read meta.yaml
  │     └── Upsert auctor.content_drafts
  │
  ├── For each changed reviews/review-NNN.md:
  │     └── Insert auctor.editor_reviews
  │
  ├── For each changed validation/seo-report.json:
  │     └── Insert auctor.seo_validation_runs
  │
  └── Update .auctor/sync/last-sync.json
```

The hook delegates to a Next.js API endpoint (/api/content-engine/sync/outbound), which reads workspace files via AUCTOR_WORKSPACE_ROOT env var and upserts to the database. Incremental, idempotent, fast.
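The endpoint's dispatch step is essentially a path router. A sketch, with the action names assumed rather than taken from the actual route:

```typescript
// Sketch: map a changed workspace path to the sync action the outbound
// endpoint would take. Kind names are hypothetical labels, not real APIs.
type SyncAction =
  | { kind: "brief" | "draft" | "review" | "seoReport" | "planItem"; slug: string }
  | null;

function routeChangedFile(path: string): SyncAction {
  let m: RegExpMatchArray | null = null;
  if ((m = path.match(/^content\/([^/]+)\/brief\.md$/))) return { kind: "brief", slug: m[1] };
  if ((m = path.match(/^content\/([^/]+)\/(?:draft\.md|meta\.yaml)$/)))
    return { kind: "draft", slug: m[1] };
  if ((m = path.match(/^content\/([^/]+)\/reviews\/review-\d+\.md$/)))
    return { kind: "review", slug: m[1] };
  if ((m = path.match(/^content\/([^/]+)\/validation\/seo-report\.json$/)))
    return { kind: "seoReport", slug: m[1] };
  if ((m = path.match(/^plan\/([^/]+)\/plan\.yaml$/))) return { kind: "planItem", slug: m[1] };
  return null; // non-content file: nothing to sync
}
```

Dispatching on paths keeps the endpoint incremental: only files named by `git diff HEAD~1 --name-only` are re-parsed and upserted.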

5.4 Inbound Sync (Database → Filesystem)

A polling daemon in the Electron main process checks for DB changes every 30 seconds:

```
Operator approves brief in web UI
  ├── PATCH /api/briefs/:id → updates DB
  │
  ├── Inbound daemon detects change (polling updatedAt columns)
  │
  ├── Updates content/<slug>/brief.md frontmatter: status → approved
  │
  └── git commit -m "sync(inbound): operator approved brief"
```

Handles a limited set of operations — primarily status changes, schedule dates, inline feedback — because those are what operators change through the web UI.
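The daemon's core query reduces to filtering rows by updatedAt against the last checkpoint. A minimal sketch, with the row shape assumed:

```typescript
// Sketch: given rows returned by the inbound API and the previous sync
// checkpoint, keep only rows updated since that checkpoint.
type Row = { id: string; updatedAt: string }; // updatedAt as ISO-8601

function changedSince(rows: Row[], lastSyncIso: string): Row[] {
  const cutoff = Date.parse(lastSyncIso);
  return rows.filter((row) => Date.parse(row.updatedAt) > cutoff);
}

const changed = changedSince(
  [
    { id: "brief-1", updatedAt: "2026-03-14T10:00:00Z" },
    { id: "brief-2", updatedAt: "2026-03-14T09:00:00Z" },
  ],
  "2026-03-14T09:30:00Z",
);
```

In practice the timestamp comparison would happen in SQL on the server side; the client-side filter above just states the semantics.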

5.5 Conflict Resolution

  • Filesystem wins for content body changes (agent is the authoritative producer)
  • Database wins for operational state (operator is the authoritative decision-maker)
  • Conflicts are logged in .auctor/sync/conflicts.json for review
  • The agent reads the conflict log on its next cycle and adjusts
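The resolution rule above is simple enough to state as code. A sketch with hypothetical names, producing an entry of the shape one might log to conflicts.json:

```typescript
// Sketch of the Section 5.5 rule: content body favors the filesystem
// (agent is the producer), operational state favors the database
// (operator is the decision-maker). All names here are assumptions.
type Conflict = { path: string; concern: "contentBody" | "operationalState" };
type Resolution = { path: string; winner: "filesystem" | "database"; loggedAt: string };

function resolveConflict(conflict: Conflict, now = new Date()): Resolution {
  return {
    path: conflict.path,
    winner: conflict.concern === "contentBody" ? "filesystem" : "database",
    loggedAt: now.toISOString(), // entry destined for .auctor/sync/conflicts.json
  };
}
```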

5.6 The Manifest

.auctor/manifest.json maintains the mapping between filesystem paths and database IDs:

```json
{
  "schemaVersion": 1,
  "siteKey": "consul",
  "lastFullSync": "2026-03-14T10:00:00Z",
  "contentIndex": {
    "ai-agents-for-content": {
      "planItemId": "uuid-1",
      "briefId": "uuid-2",
      "draftId": "uuid-3",
      "branch": "content/ai-agents-for-content",
      "status": "pending_human_review",
      "lastOutboundSync": "2026-03-14T10:00:00Z",
      "lastInboundSync": "2026-03-14T09:45:00Z"
    }
  }
}
```
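A sketch of how sync code might update an entry after a successful outbound push. The helper name is hypothetical; the field names follow the manifest example:

```typescript
// Sketch: immutably stamp one contentIndex entry's lastOutboundSync.
type ManifestEntry = {
  planItemId: string; briefId: string; draftId: string;
  branch: string; status: string;
  lastOutboundSync: string; lastInboundSync: string;
};
type Manifest = {
  schemaVersion: number; siteKey: string; lastFullSync: string | null;
  contentIndex: Record<string, ManifestEntry>;
};

function markOutboundSynced(manifest: Manifest, slug: string, at: string): Manifest {
  const entry = manifest.contentIndex[slug];
  if (!entry) throw new Error(`unknown slug: ${slug}`);
  return {
    ...manifest,
    contentIndex: { ...manifest.contentIndex, [slug]: { ...entry, lastOutboundSync: at } },
  };
}

const manifest: Manifest = {
  schemaVersion: 1, siteKey: "consul", lastFullSync: "2026-03-14T10:00:00Z",
  contentIndex: {
    "ai-agents": {
      planItemId: "uuid-1", briefId: "uuid-2", draftId: "uuid-3",
      branch: "content/ai-agents", status: "pending_human_review",
      lastOutboundSync: "2026-03-14T10:00:00Z", lastInboundSync: "2026-03-14T09:45:00Z",
    },
  },
};
```

Returning a new object rather than mutating in place makes it safe to retry a failed write of manifest.json without corrupting the in-memory copy.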

6. Subagent Evolution

Each Harness subagent transitions from MCP-tool-dependent to filesystem-native:

| Subagent | DB-Only (Today) | Filesystem-Native (Tomorrow) |
|---|---|---|
| Explorer | readDomainSummary, readClusterSummaries, queryContentPlan MCP tools | Read("strategy/active.yaml"), Glob("plan/*/plan.yaml"), Grep across library/ |
| Researcher | Returns data as tool results (ephemeral, lost after session) | Writes content/<slug>/research/*.md as persistent, searchable, version-controlled artifacts |
| Writer | formatForCms({ markdownBody: entireString }) — rewrites the whole body | Read brief → Write draft → Edit for surgical revisions |
| Editor | JSON edit suggestions in EditorReview rows | Grep for voice patterns, Read draft, Bash readability scripts, Write reviews, optionally Edit the draft directly |
| Validator | MCP tool for SEO check | Bash(".auctor/scripts/validate-seo.sh"), write validation results to seo-report.json |

MCP tools are retained only for operations that remain database-only: external API calls (webSearch, keyword research), CMS upload, indexing API, and database-specific queries (extractions, graph edges) that don't have natural file representations.


7. Scheduler Cycles (Updated)

Morning

```
Read memory files to re-orient on current priorities.

Check workspace state:
- git log --since=yesterday --oneline
- git status
- git branch (list active content branches)

Check content pipeline:
- Grep plan/*/plan.yaml for status: approved (items needing briefs)
- Grep content/*/brief.md for status: pending_review
- Grep content/*/draft.md for status: pending_human_review
- Read .auctor/sync/conflicts.json for sync issues

If approved plan items exist without briefs, start the highest-priority one:
- Create content/<slug>/ from template
- Research the topic (webSearch + competitor data from competitors/)
- Write the brief

Update memory/observations.md. Commit.
```

Midday

```
Read memory and today's observations.

Check progress on content branches:
- git log --since="6 hours ago" --oneline
- For each piece, read current status

Look for anomalies:
- Grep library/ for articles that may need updating
- Read new .auctor/sync/conflicts.json entries

If a draft is in revision, continue. If validation failed, fix issues.

Write brief observations. Commit.
```

Evening

```
Summarize today's work:
- git log --since="12 hours ago" --stat
- For each content piece touched, summarize accomplishments

Log decisions in memory/decisions.md with specific commit references.
Capture learnings in memory/learnings.md.
Set tomorrow's priorities based on pipeline state. Commit.
```

Weekly

```
Full weekly review:
- git log --since='7 days ago' --stat
- git shortlog --since='7 days ago' --no-merges
- Count: pieces started, briefs completed, drafts completed, published
- Measure: average revision cycles per piece

Content performance:
- Grep library/ for this week's publications
- Check against competitors/ for differentiation

Strategy review:
- Read strategy/active.yaml
- Compare against plan/ pipeline — are we tracking to strategy?

Write weekly summary with concrete metrics from git history.
Update strategy/active.yaml if adjustments needed. Commit.
```

8. What Stays in PostgreSQL

Not everything belongs in files. The following remain database-only:

| Data | Why |
|---|---|
| agent_runtime_config | Configuration, not content. Read by Mastra at runtime. |
| agent_audit_logs | High-frequency append-only logs. Git commits have too much overhead. |
| sync_runs | Workflow execution tracking. Ephemeral operational state. |
| api_cost_log | Financial data with aggregation queries. Relational is better. |
| system_events | Event queue with consumption semantics. Not a file pattern. |
| keyword_metrics | Enriched metrics from external APIs. Updated programmatically. |
| gsc_performance | Time-series analytics. Relational queries essential. |
| page_snapshots | Binary/large HTML snapshots. Git isn't for staging data. |
| domain_rankings | Time-series data with complex queries. |
| Extracted intelligence (summaries, graph edges) | Derived from content through LangExtract — computational, not authoring. |

The line is clean: content artifacts (things the agent reads, writes, edits, searches) move to the filesystem. Operational data (things the system queries, aggregates, reports on) stays in the database.


9. Git Hooks: The Validation Pipeline

Pre-commit

Validates every staged content file before it can be committed:

  • YAML syntax validation on all .yaml files
  • Frontmatter presence check on brief.md and draft.md
  • Minimum word count warning on drafts
  • Internal link verification against library/

Failures block the commit. New validations are added by dropping a script into .auctor/scripts/ and calling it from the hook — no agent code changes, no MCP tool implementation, no API endpoint. The same extensibility model that powers CI/CD pipelines.
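The frontmatter and word-count checks are simple enough to sketch as pure functions the hook would run over each staged file (the 800-word threshold is an illustrative assumption, not a documented limit):

```typescript
// Sketch: two pre-commit checks as pure functions over file contents.
function hasFrontmatter(source: string): boolean {
  return /^---\n[\s\S]+?\n---/.test(source);
}

function wordCount(markdown: string): number {
  const body = markdown.replace(/^---\n[\s\S]*?\n---\n?/, ""); // drop frontmatter
  return body.split(/\s+/).filter(Boolean).length;
}

// Returns a warning string, or null if the draft clears the threshold.
function wordCountWarning(markdown: string, minimum = 800): string | null {
  const count = wordCount(markdown);
  return count < minimum ? `draft is ${count} words (minimum ${minimum})` : null;
}
```

The hook itself would iterate `git diff --cached --name-only`, read each staged draft, and exit non-zero on a failed check so the commit is blocked.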

Prepare-commit-msg

Auto-detects content type from staged files and formats the commit message using the type(slug): description convention. Only activates for fresh commits (not amends or merges), and skips if the message already follows the convention.
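The detection step can be sketched as a pure function over the staged paths (the path-to-type mapping below is an assumption consistent with the workspace layout):

```typescript
// Sketch: infer the type(slug): prefix from the first staged content file.
function inferCommitPrefix(stagedPaths: string[]): string | null {
  for (const path of stagedPaths) {
    let m: RegExpMatchArray | null = null;
    if ((m = path.match(/^content\/([^/]+)\/brief\.md$/))) return `brief(${m[1]}): `;
    if ((m = path.match(/^content\/([^/]+)\/draft\.md$/))) return `draft(${m[1]}): `;
    if ((m = path.match(/^content\/([^/]+)\/reviews\//))) return `review(${m[1]}): `;
    if ((m = path.match(/^content\/([^/]+)\/research\//))) return `research(${m[1]}): `;
    if ((m = path.match(/^content\/([^/]+)\/validation\//))) return `validate(${m[1]}): `;
  }
  return null; // no content files staged: leave the message alone
}
```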

Post-commit

Triggers outbound sync. Detects whether content-related files changed (anything under content/, plan/, library/, strategy/, competitors/), delegates to the sync endpoint, runs in the background so the commit returns immediately.


10. Build Plan

Phase 0: Foundation — Close Existing Gaps

Goal: Close gaps in the existing architecture before adding the filesystem layer.

  • MCP tool proxy route (frontend/app/api/content-engine/mcp-tool/route.ts): Accepts { tool, args }, dispatches to matching content-engine tool function, returns JSON. Unblocks desktop Claude Code sessions from using any content engine tool.
  • IPC stubs (desktop/src/ipc/handlers.ts): Implement WORKSPACE_LIST (recursive readdir) and WORKSPACE_READ (readFile with path validation). Enables renderer to browse workspace files.

Phase 1: Workspace Filesystem Structure

Goal: Create the directory layout, initialize git, install hooks, seed scripts and templates.

  • Extend ensure-dirs.ts with the full directory tree (replacing old workspace/drafts, workspace/briefs, workspace/sites/consul)
  • Git init in workspace root with .gitignore (excluding .auctor/sync/)
  • Symlink .auctor/hooks/ to .git/hooks/
  • Seed validation scripts (.auctor/scripts/validate-seo.sh, check-links.sh, readability.sh, structure-check.sh)
  • Seed content templates (blog-post, comparison, pillar, how-to)
  • Seed manifest.json and strategy/active.yaml from defaults

New file: desktop/src/bootstrap/seed-scripts.ts
Modified: desktop/src/bootstrap/ensure-dirs.ts, desktop/src/workspace/templates.ts

Phase 2: Content Serializer

Goal: Bidirectional conversion between database row shapes and filesystem file shapes.

New file: desktop/src/sync/content-serializer.ts

Functions for each content type:

  • planItemToYaml / yamlToPlanItem
  • briefToFiles / filesToBrief
  • draftToFiles / filesToDraft
  • reviewToFile / fileToReview
  • seoReportToFile / fileToSeoReport
  • consulPostToFiles, competitorToFiles, strategyToYaml / yamlToStrategy
  • toSlug (kebab-case, deduped)
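Of these, toSlug has the one behavioral subtlety worth pinning down: deduplication against already-used slugs. A sketch of one plausible implementation:

```typescript
// Sketch: kebab-case a title, then dedupe with a numeric suffix if the
// slug is already taken. The suffix scheme is an assumption.
function toSlug(title: string, existing: Set<string> = new Set()): string {
  const base = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, "");    // trim leading/trailing hyphens
  if (!existing.has(base)) return base;
  let n = 2;
  while (existing.has(`${base}-${n}`)) n++;
  return `${base}-${n}`;
}
```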

Verified by round-trip unit tests: serialize(deserialize(input)) === input for every type.
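The round-trip property for one pair might look like the following; the stand-in serializers here are deliberately minimal (the real ones use js-yaml and the actual row shapes, which are assumptions from this sketch's point of view):

```typescript
// Sketch: a toy planItemToYaml / yamlToPlanItem pair demonstrating the
// round-trip property serialize(deserialize(x)) === x.
type PlanItem = { slug: string; status: string; priority: number };

function planItemToYaml(item: PlanItem): string {
  return `slug: ${item.slug}\nstatus: ${item.status}\npriority: ${item.priority}\n`;
}

function yamlToPlanItem(yaml: string): PlanItem {
  const fields = Object.fromEntries(
    yaml.trim().split("\n").map((line) => {
      const idx = line.indexOf(": ");
      return [line.slice(0, idx), line.slice(idx + 2)];
    }),
  );
  return { slug: fields.slug, status: fields.status, priority: Number(fields.priority) };
}

const item: PlanItem = { slug: "ai-agents", status: "approved", priority: 1 };
const roundTripped = yamlToPlanItem(planItemToYaml(item));
```

The same structure, one assertion per content type, gives the Phase 2 test suite its shape.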

Dependency: yaml (js-yaml) added to desktop/package.json.

Phase 3: Data Population (DB → Filesystem)

Goal: One-time migration — read all existing content from the database and write it to the workspace.

  • New script desktop/src/sync/populate-workspace.ts — runs during bootstrap if manifest.lastFullSync === null
  • Fetches all plan items, briefs, drafts, reviews, SEO reports, consul posts, competitors, strategy via Next.js API
  • Writes each to the appropriate filesystem location using the serializer
  • Builds the manifest contentIndex
  • Creates initial git commit
  • Bulk-fetch GET endpoints added as needed (plan-items, briefs, consul-posts, competitors, strategy)

Hooked into: desktop/src/main.ts as Step 8.5 (after servers started, before file watcher).

Phase 4: Outbound Sync (Filesystem → Database)

Goal: When the agent commits content changes, push to PostgreSQL so the web UI stays current.

  • Post-commit hook (seeded by seed-scripts.ts): Detects changed content files, POSTs to sync endpoint
  • Sync endpoint (frontend/app/api/content-engine/sync/outbound/route.ts): Reads workspace files via AUCTOR_WORKSPACE_ROOT env var, parses with serializer, upserts to database
  • Route dispatches based on file path: plan/ → upsert plan items, content/*/brief.md → save brief, content/*/draft.md → save draft, etc.

Phase 5: Inbound Sync (Database → Filesystem)

Goal: When the operator changes things in the web UI, those changes appear in workspace files.

  • Daemon (desktop/src/sync/inbound-sync.ts): Polls every 30 seconds for DB changes since last check
  • Inbound API (frontend/app/api/content-engine/sync/inbound/route.ts): Returns changes since a given timestamp by querying updatedAt columns
  • Handles: brief status changes, draft status changes, plan item updates, draft body edits
  • Writes file updates and commits with sync(inbound): prefix

Hooked into: desktop/src/main.ts as Step 9.5, with stop() in shutdown.

Phase 6: Tool Policy & CLAUDE.md Update

Goal: Enable the agent to use native tools for content work.

  • Tool policy (tool-policy.ts): Remove Bash from disallowedTools. Add case-by-case evaluation — auto-allow git commands and .auctor/scripts/ execution; require approval for everything else.
  • CLAUDE.md template (templates.ts): Full rewrite documenting the filesystem workflow — directory structure, file formats, content lifecycle as file operations, validation scripts, git conventions, commit message format. Retains MCP tool references for database-only operations.
  • Runtime manager (runtime-manager.ts): Remove 'Bash' from disallowedTools array.

Phase 7: Scheduler Prompt Updates

Goal: Make autonomous cycles filesystem-native.

Rewrite all four cycle prompts in scheduler.ts to use git log, git status, git branch, Grep across content directories, and .auctor/scripts/ validation — as described in Section 7.

Phase 8: Git Hooks — Pre-commit Validation

Goal: Automated quality gates on every commit.

  • Pre-commit hook: YAML validation, frontmatter checks, word count warnings
  • Prepare-commit-msg hook: Auto-format commit messages using type(slug): description convention

Both seeded by seed-scripts.ts.
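The convention the prepare-commit-msg hook enforces can be sketched as a single check. The set of allowed types below is an assumption for illustration (the hook seeded by seed-scripts.ts may accept a different set); note that the sync(inbound): prefix from Phase 5 fits the same shape:

```typescript
// Sketch of the type(slug): description convention check.
// The allowed-type list is an assumption, not the real seeded hook's list.
const COMMIT_RE = /^(plan|brief|draft|edit|publish|sync|chore)\(([a-z0-9-]+)\): .+$/;

export function isConventionalMessage(message: string): boolean {
  // Only the subject line (first line) must follow the convention.
  return COMMIT_RE.test(message.split("\n")[0]);
}
```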

Phase 9: Integration & End-to-End Testing

Three manual E2E tests:

  1. Agent creates content: Plan item → scaffold from template → research → write brief → commit → outbound sync → appears in web UI → operator approves → inbound sync → agent reads approved status → starts drafting
  2. Agent edits existing content: Read draft → surgical Edit → run validation script → commit → verify git diff shows minimal change → verify DB reflects update
  3. Cross-content search: Agent working on new article → Grep library for internal link opportunities → adds links to meta.yaml → verify links reference real published content

Dependency Order

  • Phase 0 (MCP proxy + IPC stubs) ← no dependencies, unblocks everything
  • Phase 1 (directory structure + git) ← no dependencies
  • Phase 2 (content serializer) ← needs Phase 1
  • Phase 3 (data population) ← needs Phase 1 + Phase 2
  • Phase 4 (outbound sync) ← needs Phase 2 + Phase 3
  • Phase 5 (inbound sync) ← needs Phase 4
  • Phase 6 (tool policy + CLAUDE.md) ← needs Phase 1
  • Phase 7 (scheduler prompts) ← needs Phase 6
  • Phase 8 (git hooks) ← needs Phase 1; parallelizable with Phases 4–7
  • Phase 9 (E2E testing) ← needs everything

Phases 0 and 1 can run in parallel. Phase 8 can run alongside Phases 4–7.


11. File Inventory

New Files

| File | Phase | Purpose |
| --- | --- | --- |
| frontend/app/api/content-engine/mcp-tool/route.ts | 0 | MCP tool proxy |
| frontend/app/api/content-engine/sync/outbound/route.ts | 4 | Outbound sync endpoint |
| frontend/app/api/content-engine/sync/inbound/route.ts | 5 | Inbound sync endpoint |
| frontend/app/api/content-engine/plan-items/route.ts | 3 | List plan items (GET) |
| frontend/app/api/content-engine/briefs/route.ts | 3 | List briefs (GET) |
| frontend/app/api/content-engine/consul-posts/route.ts | 3 | List consul posts (GET) |
| frontend/app/api/content-engine/competitors/route.ts | 3 | List competitors (GET) |
| frontend/app/api/content-engine/strategy/route.ts | 3 | Get strategy (GET) |
| desktop/src/sync/content-serializer.ts | 2 | DB ↔ file conversion |
| desktop/src/sync/content-serializer.test.ts | 2 | Serializer round-trip tests |
| desktop/src/sync/populate-workspace.ts | 3 | Initial data population |
| desktop/src/sync/inbound-sync.ts | 5 | DB → file daemon |
| desktop/src/bootstrap/seed-scripts.ts | 1 | Validation scripts, hooks, templates |

Modified Files

| File | Phase | Change |
| --- | --- | --- |
| desktop/src/bootstrap/ensure-dirs.ts | 1 | New directory structure, git init, hook symlinks |
| desktop/src/workspace/templates.ts | 1, 6 | Template constants, CLAUDE.md rewrite |
| desktop/src/main.ts | 3, 5 | Add populate step, add inbound sync daemon |
| desktop/src/ipc/handlers.ts | 0 | Implement WORKSPACE_LIST, WORKSPACE_READ |
| desktop/src/claude/tool-policy.ts | 6 | Restricted Bash evaluation |
| desktop/src/claude/runtime-manager.ts | 6 | Remove Bash from disallowedTools |
| desktop/src/workspace/scheduler.ts | 7 | Rewrite all 4 cycle prompts |
| desktop/package.json | 2 | Add yaml dependency |

12. Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| Two-way sync complexity | Clean split: filesystem-authoritative for content, DB-authoritative for operational state. Mostly non-overlapping concerns. Fallback: filesystem as sole source of truth with a read-only DB. |
| Git repo size | Text-only content stays small (10k articles × 3k words is roughly 180 MB of raw text, considerably less in git's compressed object store). .gitignore large assets, use git-lfs for binaries, and periodically archive old branches. |
| Concurrent agent writes | Scheduler already has conflict avoidance (checks for active sessions, defers 15 min). Extend to file-level awareness via git status before sessions. |
| Web UI latency | Post-commit hook runs synchronously, so the DB is updated within seconds. Inbound sync can use Supabase Realtime for near-instant propagation. |
| Operator learning curve | Web UI abstracts git. The operator sees "Review Draft" not "Review PR," "Approve" not "Merge," "View History" not git log. Git is infrastructure, not interface. |

13. The Deeper Thesis

The argument is not merely "content should live in files." It is a stronger claim: the optimal architecture for agentic content creation is one where the agent operates in its native medium.

Claude Code was built to work with files under version control. Its Read, Edit, Grep, Glob tools are the product of intensive optimization for file-based workflows. Its git integration provides versioning, branching, diffing, and collaboration for free. Its Bash access enables arbitrary scripting and validation.

Forcing content through database abstractions is like asking a master carpenter to work with oven mitts on. The skills are there. The tools are there. But the interface between the agent and the material strips away every advantage.

Content-as-Code removes the oven mitts. Content becomes the same kind of artifact the agent already knows how to work with. Every file operation that makes Claude Code exceptional at software engineering transfers directly to content creation. And the emergent properties of git — time travel, blame, bisect, branching, hooks — create a content operations infrastructure that no purpose-built CMS can match.