Auctor Customer-Facing Productization Plan, Deep Version
Summary
- Auctor already has a real product kernel: ingest research, extract structure, track keywords and competitors, generate strategy, create briefs and drafts, validate SEO, publish, automate, and monitor outcomes.
- The current implementation is still an internal operator stack, not a customer-safe SaaS. The biggest blockers are not feature gaps; they are trust boundaries, tenancy, security, connector architecture, onboarding, billing, and supportability.
- The launch recommendation is a phased B2B SaaS conversion: dedicated Auctor infrastructure, shared multi-tenant product architecture with strict RLS and RBAC, connector-based publishing, human approval before publish in beta, and web app only for customer-facing v1.
- “Customer-facing ready” should mean all of the following are true: dedicated production project, real auth and memberships, workspace/site tenancy, RLS on every customer table, no service-role access in request paths, connectorized publishing, quota enforcement, support/admin tooling, auditability, and compliance baseline.
Current-State Diagnosis
- The `auctor` schema is already valuable and unusually strong for a pre-product internal tool. It covers ingestion, research memory, keyword intelligence, strategy, briefs, drafts, validation, audit, and sync history.
- The current Supabase environment is the wrong boundary for external users. One live project hosts `auctor`, `consul`, and `generative`, which is acceptable for internal experimentation and not acceptable for external production.
- Every `auctor` table currently has RLS disabled. That alone disqualifies the existing project from customer traffic.
- Server-side data access is broadly privileged in `supabase.ts`, which means the runtime assumes trusted operators rather than hostile or isolated tenants.
- Product defaults are hardcoded to `consul` in `defaults.ts`, and the same assumption appears in harness state and scheduled jobs. That is a single-customer posture disguised as a generic product.
- Publishing is not truly CMS-agnostic yet. The current write path in `repositories.ts` writes directly into `consul.posts` and related `consul` tables rather than going through a destination connector abstraction.
- Integration secrets are effectively deployment-global, not workspace-scoped. The current model is appropriate for one operator-owned environment and not for multiple customers with separate credentials, rotations, or audit trails.
- The FastAPI service is only a small extraction sidecar with local CORS assumptions. The real product backend today is Next.js plus Supabase; the Python service is not the customer platform.
- The UI is an operator cockpit. It exposes powerful internal surfaces such as raw model settings, MCP settings, DB checks, and desktop-oriented workflows without the separation expected in a customer product.
- Release hardening is incomplete. TypeScript build errors are ignored, operational gates are weak, and there is no customer-grade rollout, support, or incident posture.
Product Definition and Launch Posture
- v1 should target design-partner B2B customers: in-house content teams, SEO teams, growth teams, and agencies managing one or more sites per client.
- The tenancy hierarchy should be `organization -> workspace -> site`, where one organization may own multiple workspaces and each workspace may manage multiple sites, locales, and publishing targets.
- The customer-facing product should be the web app only. Desktop runtime, workspace-sync, and direct MCP access should remain internal or admin-gated until they have tenant-safe authentication, quotas, and support boundaries.
- The initial value proposition should be “content operations system for research-backed SEO and AI-visibility content,” not “general-purpose CMS automation platform.” That framing keeps the surface coherent.
- Beta should require human approval before external publishing. Auto-publish can exist only as a per-workspace opt-in after approval workflow, rollback, and audit metrics are proven.
- The product should support both “assisted” mode and “autonomous” mode, but autonomous mode in v1 beta should stop at draft generation and recommendations, not unreviewed external publish.
- Source systems and destination systems should be modeled separately. Analytics and discovery connectors are not the same thing as CMS publishing connectors and should not share the same abstraction.
- The first commercial milestone should be “one workspace can onboard, sync data, generate a plan, create a brief, create a draft, validate it, review it, and publish to a supported CMS with full audit history.”
- The customer information architecture should be organized around setup, content pipeline, performance, and administration, not around internal runtime concepts.
Infrastructure and Environment Architecture
- Create a brand-new dedicated Supabase project for Auctor production. Do not serve customer traffic from the mixed internal project.
- Create separate `dev`, `staging`, and `prod` environments with independent databases, storage buckets, auth settings, webhooks, and secrets. Staging should be production-shaped, not a reduced sandbox.
- Make Postgres the system of record for product state, job state, audit state, entitlements, and connector state. Do not split core customer truth across desktop files and DB state.
- Keep object storage separate from structured data. Store page snapshots, extraction artifacts, logs, uploaded assets, prompt traces, and exported reports in storage with explicit retention policies.
- Standardize secrets management outside request-time environment mutation. Credentials should be encrypted at rest, versioned, rotated, and loaded only for the specific job or request that needs them.
- Define one control plane and one execution plane. The control plane is Next.js plus authenticated APIs; the execution plane is background workers plus the internal extraction service.
- Use Trigger.dev as the v1 orchestration layer for scheduled and long-running jobs because it already exists in the stack and provides retries, concurrency control, and audit-friendly run history.
- Keep the Python extraction service as a private dependency behind service-to-service authentication. It should not be publicly reachable as a customer API.
- Add backup, restore, and disaster recovery procedures before beta. This includes database restore drills, storage restore validation, and documented RTO/RPO targets.
- Add a real observability stack before beta: structured logs, traces, metrics, job run dashboards, error tracking, and workspace-aware correlation IDs.
Identity, Authorization, and Tenant Model
- Use Supabase Auth for v1 identity. Support email login plus Google sign-in first; defer SAML/SCIM until enterprise demand justifies it.
- Add first-class tables for `organizations`, `workspaces`, `sites`, `memberships`, `roles`, `invites`, `api_keys`, and `support_sessions`.
- Define roles explicitly: `owner`, `admin`, `operator`, `editor`, `reviewer`, `viewer`, and `billing_admin`. Internal support roles should be separate and never stored as normal customer memberships.
- Make workspace membership the root authorization primitive. Site access should derive from workspace membership plus optional site-level restrictions for larger accounts.
- Add invite acceptance, seat management, transfer of ownership, and offboarding flows. Customer-facing team management is non-optional once the product leaves internal use.
- Add scoped API keys for machine access. Keys should support permissions like read-only analytics, content creation, publish execution, or connector administration.
- Every request and job should resolve a single `AuthContext` that includes actor identity, organization, workspace, site scope, role, entitlements, and impersonation metadata if applicable.
- Add approval policies as a first-class authorization layer. Publishing, connector changes, billing changes, and auto-run activation should be separately permissioned even for users with broad workspace access.
- Implement safe support impersonation for internal staff. It should require explicit reason capture, short-lived sessions, audit logs, and visual indication in the UI.
- Make every customer-visible mutation auditable with `who`, `when`, `what changed`, `previous value`, and `source` fields.
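The single resolved authorization object described above could look like the following sketch. Field names beyond those listed in this plan, and the action-to-role matrix, are illustrative assumptions, not the final design:

```typescript
// Sketch of the per-request authorization object described in this plan.
// Role names mirror the plan; the permission matrix is illustrative only
// and would really come from the approval-policy layer.
type Role =
  | "owner" | "admin" | "operator" | "editor"
  | "reviewer" | "viewer" | "billing_admin";

interface AuthContext {
  actorId: string;
  organizationId: string;
  workspaceId: string;
  siteIds: string[];                    // site scope within the workspace
  role: Role;
  entitlements: Record<string, number>; // e.g. { drafts_per_month: 100 }
  impersonation?: { supportActorId: string; reason: string; expiresAt: string };
}

// Hypothetical action -> allowed-roles matrix.
const ALLOWED: Record<string, Role[]> = {
  "content.publish": ["owner", "admin", "operator"],
  "content.review": ["owner", "admin", "operator", "editor", "reviewer"],
  "billing.manage": ["owner", "billing_admin"],
};

// Central permission check: unknown actions are denied by default.
function can(ctx: AuthContext, action: string): boolean {
  return (ALLOWED[action] ?? []).includes(ctx.role);
}
```

Resolving one such object per request also gives jobs and audit logs a single, serializable record of who acted under which scope.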
Data Model Refactor
- Preserve the existing `auctor` operational model wherever possible. The goal is not to rebuild all content tables; it is to add a real tenant boundary and customer-safe ownership semantics around them.
- Add `organization_id`, `workspace_id`, and `site_id` where appropriate across all customer-owned rows. `site_key` should become a subordinate site identifier, not the top-level tenant boundary.
- Convert `site_targets` into workspace-owned sites with domain, locale, publishing defaults, editorial defaults, analytics connections, and health status.
- Keep `documents`, `document_extractions`, `summary_nodes`, and `graph_edges` as the research memory layer, but scope them to workspace and site so one customer’s research corpus can never intersect another’s.
- Keep `tracked_keywords`, `keyword_metrics`, `domain_rankings`, `serp_snapshots`, `ai_visibility_data`, and `gsc_performance` as the performance intelligence layer, again scoped to workspace and site.
- Keep `content_plan_items`, `content_briefs`, `content_drafts`, `editor_reviews`, and `seo_validation_runs` as the content lifecycle layer, but add explicit ownership, lifecycle status, approval state, and provenance fields.
- Keep `agent_runtime_config`, `strategy_directives`, `sync_runs`, `agent_audit_logs`, and `system_events` as the execution and audit layer, but make them tenant-aware and safe to expose selectively in customer reporting.
- Add new root tables for `provider_connections`, `source_sync_jobs`, `publish_targets`, `content_type_mappings`, `field_mappings`, `taxonomy_mappings`, `publish_runs`, `publish_run_events`, `external_records`, `usage_events`, `subscriptions`, and `entitlements`.
- Add `created_by`, `updated_by`, `archived_at`, `deleted_at`, and immutable run provenance on major business entities. Customer-facing products need history and safe restore semantics.
- Add explicit workflow statuses across the content lifecycle. The minimum should cover planned, briefing, drafting, review, approved, publish pending, published, failed, and archived.
- Add explicit lineage fields so every brief and draft can point back to its source documents, keywords, competitors, strategies, prompts, model configuration, and connector mapping version.
- Use SQL foreign keys aggressively for identity, billing, approvals, and configuration tables. Use application-managed references only for very high-volume research and metric fact tables where FK churn or retention management would be costly.
- Add composite indexes keyed by tenant boundary first. Typical access patterns should be `workspace_id + site_id + status + updated_at`, not generic global timestamp scans.
- Add retention and archiving policy to high-volume fact tables and artifacts. Not every raw snapshot or prompt trace needs infinite retention.
Customer Configuration System
- Build a formal inheritance model: organization defaults, workspace defaults, site overrides, content-type overrides, and per-run overrides with audit history.
- Add a `brand_profile` layer containing voice, audience, positioning, claim boundaries, blocked phrases, preferred CTAs, content goals, and editorial constraints.
- Add an `editorial_policy` layer containing citation requirements, link policies, author rules, content review requirements, disclaimers, legal constraints, and quality bars.
- Add a `generation_policy` layer containing default models, fallback models, token/spend budgets, temperature-like controls, run concurrency, and allowed automation behaviors.
- Add a `publishing_policy` layer containing approval requirements, publish windows, draft-vs-live rules, slug policies, canonical URL rules, and rollback behavior.
- Add a `site_profile` layer containing locale strategy, taxonomy defaults, schema markup defaults, preferred content types, internal linking rules, and allowed destinations.
- Add content-type templates for blog posts, landing pages, listicles, comparisons, guides, and custom site-defined types. These templates should drive both generation behavior and connector field mapping.
- Version every configuration object and make the active version explicit in run history. Customers need reproducibility when a brief or draft quality changes.
- Add configuration validation and health checks. The system should tell the customer which settings are incomplete, contradictory, or blocking publish.
- Make configuration import/export possible at the workspace level. Agencies and multi-brand customers will want to duplicate good setups across workspaces.
- Keep internal “agent knobs” out of customer setup. Customers should configure business outcomes and publishing targets, not raw runtime internals.
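The inheritance model above reduces, at its core, to an ordered merge where the most specific layer wins. This sketch shows the shape; the layer contents are invented for illustration, and a real implementation would also record which layer supplied each value for audit history:

```typescript
// Resolve effective configuration by merging layers in precedence order:
// organization defaults < workspace defaults < site overrides
// < content-type overrides < per-run overrides.
type ConfigLayer = Record<string, unknown>;

function resolveConfig(...layers: (ConfigLayer | undefined)[]): ConfigLayer {
  const result: ConfigLayer = {};
  for (const layer of layers) {
    if (!layer) continue;           // a missing layer contributes nothing
    Object.assign(result, layer);   // later (more specific) layers win
  }
  return result;
}

// Example: a site override beats the workspace default,
// and a per-run override beats everything.
const effective = resolveConfig(
  { tone: "neutral", model: "default-model" },   // organization defaults
  { tone: "confident" },                         // workspace defaults
  { model: "site-preferred-model" },             // site overrides
  undefined,                                     // no content-type overrides
  { tone: "urgent" },                            // per-run overrides
);
```

Because each layer is versioned, recording the layer versions used in a run is enough to reproduce the effective configuration later.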
Runtime, Jobs, and AI Execution
- Split execution into four planes: Next.js control plane, Trigger.dev worker plane, internal extraction plane, and webhook plane.
- Every background action should use a typed `JobEnvelope` with `organization_id`, `workspace_id`, `site_id`, `actor_id`, `idempotency_key`, `request_source`, `priority`, and entitlement snapshot.
- Define first-class jobs for source sync, crawl, extraction, summary rebuild, keyword refresh, strategy generation, brief generation, draft generation, SEO validation, publish, reconciliation, and usage aggregation.
- Make jobs idempotent and replayable. A retry must not create duplicate drafts, duplicate CMS entries, or duplicate usage billing.
- Add workspace-level concurrency limits, queue prioritization, and cancellation controls so one noisy workspace cannot starve others.
- Add run artifacts for every job: prompt or config version, upstream inputs, model used, tokens used, external API calls, output hashes, and final result status.
- Add cost tracking at the run level. AI spend, crawl volume, asset storage, and publish actions should all emit usage events in the same execution path that did the work.
- Add human approval checkpoints between high-risk stages. Beta defaults should require approval before publish and optionally before draft generation for sensitive workspaces.
- Add graceful degradation for provider outages. If a source connector is down, publishing should still work; if a model fails, the run should degrade to fallback models or fail cleanly with preserved state.
- Add dead-letter handling and replay tooling for jobs that fail repeatedly. Operators need a safe recovery surface without touching the database directly.
- Add run-level explainability surfaces so customers can inspect why the system made a recommendation or generated a draft, without exposing low-level internal noise.
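A minimal `JobEnvelope` sketch with a deterministic idempotency key follows. The key derivation (tenant plus job identity hashed with SHA-256) is an assumption, not a committed design; what matters is that the same logical request always maps to the same key so a retry can be detected and deduplicated:

```typescript
import { createHash } from "node:crypto";

// Sketch of the typed envelope every background job should carry.
interface JobEnvelope {
  organizationId: string;
  workspaceId: string;
  siteId: string;
  actorId: string;
  jobType: string;               // e.g. "draft_generation"
  payloadRef: string;            // pointer to the job input, not the input itself
  idempotencyKey: string;
  requestSource: "ui" | "api" | "schedule" | "webhook";
  priority: number;
  entitlementSnapshot: Record<string, number>;
}

// Deterministic key: the same workspace + job type + payload reference
// always produces the same key, so workers can deduplicate retries
// before creating drafts, CMS entries, or usage events.
function idempotencyKey(
  workspaceId: string,
  jobType: string,
  payloadRef: string,
): string {
  return createHash("sha256")
    .update(`${workspaceId}:${jobType}:${payloadRef}`)
    .digest("hex");
}
```

Including the workspace in the key also guarantees two tenants submitting identical payloads can never collide on one deduplication record.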
Source and Destination Connector Platform
- Separate connector families into `source connectors` and `destination connectors`. The former bring data into Auctor; the latter publish or synchronize content out of it.
- Ship source connectors first for sitemap/robots ingestion, CSV import, Google Search Console, and GA4. These cover the minimum viable research and reporting loop.
- Ship destination connectors first for WordPress and Webflow. These are the best initial balance of market demand, implementation complexity, and publish semantics.
- Plan second-wave destination connectors for Contentful and Sanity. These require stronger structured-content and locale support and should come after the core mapping system is proven.
- Defer Shopify, HubSpot, Ghost, and generic custom HTTP to later phases. Do not promise “all major CMSs” before the mapping model and reconciliation model are stable.
- Define a canonical `ContentDocument` model as the publishing intermediate representation. It should cover title, slug, dek, excerpt, hero image, body blocks, SEO title, SEO description, canonical URL, authors, categories, tags, locale, `publish_at`, schema data, references, CTA blocks, and custom fields.
- Define a canonical rich-content representation instead of storing only HTML. Connectors should serialize from a structured content AST into HTML, Gutenberg blocks, Webflow rich text, Contentful rich text, or Sanity portable text as needed.
- Add `PublishTarget` as the customer-facing abstraction for a destination instance. A publish target is a connector instance plus content model mappings plus credentials plus health state.
- Add `ContentTypeMapping` so a workspace can map Auctor content types to provider-specific destinations such as WordPress post types, Webflow collections, Contentful content types, or Sanity document types.
- Add `FieldMapping` with a constrained mapping DSL. It should support canonical field source, provider field destination, default values, transforms, validation, required/optional flags, and preview output.
- Add `TaxonomyMapping` and `AuthorMapping` so categories, tags, authors, and references can be resolved before publish rather than guessed in-flight.
- Add `AssetMapping` and upload policy so featured images, inline assets, OG images, and downloadable files have deterministic upload, reuse, and fallback behavior.
- Support locales as a first-class connector concern. Auctor should know whether the destination expects separate entries, localized fields, or locale-bound publishing environments.
- Add connector capability flags such as `supports_drafts`, `supports_scheduled_publish`, `supports_assets`, `supports_taxonomies`, `supports_locales`, `supports_webhooks`, and `supports_rollback`.
- Every connector should implement the same lifecycle: connection test, metadata sync, mapping configuration, dry-run preview, draft create, draft update, publish, unpublish if supported, reconciliation, and webhook verification.
- Every publish should create a `publish_run` with external IDs, payload hashes, response summaries, and recovery hints. This is the audit source for support and customer trust.
- Reconciliation is mandatory. Auctor must periodically confirm the external system still matches its last known draft or published state and surface drift if a customer edits content directly in the CMS.
- Connector-specific edge cases should be handled in connector adapters, not in core content logic. Examples include WordPress custom fields and SEO plugins, Webflow staged publish behavior, Contentful environments and locales, and Sanity document references.
- Add a generic custom HTTP connector only after the canonical content and mapping contract is stable. It should be framed as an advanced integration, not the primary product path.
- Build a guided connection wizard for every connector. It should validate credentials, fetch destination metadata, suggest default mappings, and highlight missing required fields before the customer can publish.
- Add a publish preview that shows the transformed outbound payload and the destination field mapping result before anything is written to the external CMS.
- Add rollback behavior where the connector supports it. Where rollback is impossible, provide compensating actions and explicit operator instructions.
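The field-mapping DSL described in this section reduces, at its core, to applying per-field rules to the canonical document. This sketch shows the shape with invented field names, a single transform, and WordPress-style destination fields as assumptions; the real DSL would be declarative data, not inline functions:

```typescript
// Minimal sketch of applying a FieldMapping to a canonical document
// to build the provider-specific outbound payload.
type CanonicalDoc = Record<string, unknown>;

interface FieldRule {
  source: string;                      // canonical field name
  destination: string;                 // provider field name
  required?: boolean;
  defaultValue?: unknown;
  transform?: (value: unknown) => unknown;
}

function applyMapping(doc: CanonicalDoc, rules: FieldRule[]): Record<string, unknown> {
  const payload: Record<string, unknown> = {};
  for (const rule of rules) {
    let value = doc[rule.source] ?? rule.defaultValue;
    if (value === undefined) {
      if (rule.required) throw new Error(`missing required field: ${rule.source}`);
      continue;                        // optional and absent: skip the field
    }
    if (rule.transform) value = rule.transform(value);
    payload[rule.destination] = value;
  }
  return payload;
}

// Example rules for a hypothetical WordPress-style target.
const rules: FieldRule[] = [
  { source: "title", destination: "post_title", required: true },
  { source: "slug", destination: "post_name", transform: (v) => String(v).toLowerCase() },
  { source: "seoDescription", destination: "meta_description", defaultValue: "" },
];
```

Running the same function against a draft without writing anywhere is exactly the dry-run preview the wizard and publish preview need.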
Customer UX, Reporting, and Workflow Design
- Rebuild the navigation around customer jobs-to-be-done: onboarding, sites, research, strategy, content pipeline, publishing, performance, team, billing, and workspace settings.
- Add a first-run onboarding flow that creates the workspace, invites teammates, registers the first site, connects source data, connects the first CMS, imports initial keywords, adds competitors, and launches the first sync.
- Add a site setup checklist and health score so customers can see what is incomplete before the system can produce reliable recommendations or publish safely.
- Keep the current strong surfaces for library, competitors, keywords, strategy, briefs, drafts, editor review, and activity, but reframe them for customer comprehension rather than operator internals.
- Add explicit approval workflow UI. Customers should be able to require reviewer signoff, editor signoff, or owner signoff depending on the workspace policy.
- Add workspace dashboards that explain pipeline status, publish readiness, connector health, keyword coverage, draft throughput, and recent failures in customer language.
- Add performance reporting that ties the full loop together: source insight, strategy decision, generated asset, publish event, ranking movement, AI visibility change, and traffic impact where data exists.
- Add provenance and explainability surfaces. Customers should be able to see why a brief exists, which documents informed a draft, which keywords were targeted, and which validation checks passed or failed.
- Hide internal-only settings such as raw MCP configuration, internal DB checks, and low-level model harness settings behind admin or feature flags. Customers should not see operator tools.
- Add notifications for failed jobs, connector issues, approval requests, publish completions, and important performance changes. Email first is sufficient for beta; Slack can follow.
- Add customer-facing help, inline setup guidance, and support entry points inside the app. Productizing the system includes making it legible, not only making it secure.
- Keep the desktop workflow as an internal augmentation layer for operators, not a customer requirement. Customers should not need a local runtime to get value.
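The site setup checklist and health score mentioned above could be as simple as a weighted completeness check. The items and weights here are invented for illustration, not the product's real checklist:

```typescript
// Sketch: compute a 0-100 site setup health score from checklist items.
interface ChecklistItem { id: string; weight: number; done: boolean }

function healthScore(items: ChecklistItem[]): number {
  const total = items.reduce((sum, i) => sum + i.weight, 0);
  if (total === 0) return 0;
  const done = items.filter((i) => i.done).reduce((sum, i) => sum + i.weight, 0);
  return Math.round((done / total) * 100);
}

// Hypothetical checklist: 5 of 8 weight units complete.
const score = healthScore([
  { id: "cms_connected", weight: 3, done: true },
  { id: "gsc_connected", weight: 2, done: true },
  { id: "keywords_imported", weight: 2, done: false },
  { id: "brand_profile_set", weight: 1, done: false },
]);
```

Weighting items lets the score reflect what actually blocks safe publishing (a missing CMS connection matters more than a missing brand profile) rather than treating all setup steps as equal.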
Billing, Quotas, Abuse Controls, and Commercial Readiness
- Use Stripe for v1 billing because it is the fastest path to subscriptions, invoices, payment methods, and usage-based metering.
- Define plans around seats, tracked keywords, crawl volume, generated briefs and drafts, publish actions, storage, and optional AI spend ceilings.
- Add `subscriptions`, `entitlements`, and `usage_events` as first-class domain objects, not ad hoc counters.
- Emit usage events in the same transactional path as the underlying job or mutation so billing reflects actual work and can be audited.
- Enforce soft limits and hard limits separately. Soft limits warn and surface upgrade paths; hard limits block execution that would exceed contract or abuse boundaries.
- Add workspace-level spend controls so owners can cap model spend, disable certain premium models, or restrict automation breadth.
- Add trial and onboarding quotas explicitly. Trial workspaces should be able to experience the product without opening abuse holes.
- Add seat management, billing admin permissions, invoice history, and payment method management in the product.
- Add abuse controls on connector tests, publish attempts, generation loops, and high-volume ingestion so a single workspace cannot create runaway cost or platform instability.
- Support agencies via multi-workspace organizations rather than white-labeling in v1. White-label and advanced reseller packaging can wait.
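The soft-versus-hard limit behavior above can be sketched as a single entitlement check that returns a decision rather than a boolean, so the caller can warn without blocking. Quota shapes and messages are assumptions:

```typescript
// Sketch of separate soft/hard quota evaluation. Soft limits warn and
// surface upgrade paths; hard limits block execution outright.
interface Quota { soft: number; hard: number }

type QuotaDecision =
  | { action: "allow" }
  | { action: "warn"; message: string }
  | { action: "block"; message: string };

function checkQuota(used: number, requested: number, quota: Quota): QuotaDecision {
  const next = used + requested;
  if (next > quota.hard) {
    return { action: "block", message: `hard limit ${quota.hard} would be exceeded` };
  }
  if (next > quota.soft) {
    return { action: "warn", message: `soft limit ${quota.soft} exceeded; upgrade suggested` };
  }
  return { action: "allow" };
}
```

Checking against the entitlement snapshot carried in the job envelope, rather than re-reading live billing state mid-run, keeps quota decisions reproducible in audit history.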
Security, Compliance, Support, and Operations
- Enable RLS on every customer-facing table before beta and build policy templates around workspace membership plus role checks.
- Remove service-role clients from normal request paths. Service-role access should be restricted to internal workers, migrations, and admin tooling that is separately authenticated.
- Replace deployment-global integration secrets with encrypted `provider_connections` scoped to workspace or site. Every secret should carry type, status, scope, key version, last validated time, and rotation history.
- Never write customer credentials into `process.env` at runtime. Credentials should be decrypted only for the request or job that needs them and discarded afterward.
- Redact prompts, tokens, credentials, and external payload secrets from logs by default. Customer-facing products cannot rely on “be careful with logs” as a policy.
- Add security event logging for login, invite acceptance, API key creation, connector changes, publish attempts, role changes, and impersonation.
- Add customer data export and deletion workflows. GDPR and CCPA readiness is easier if the data model is designed for it early.
- Add retention policies for raw snapshots, prompt traces, and run artifacts so the product can meet privacy expectations and control storage cost.
- Publish a baseline trust package before launch: privacy policy, terms, DPA template, subprocessor list, support policy, incident communication policy, and backup posture.
- Add vulnerability management, dependency scanning, and pre-release penetration testing before GA.
- Build an internal support console with tenant search, workspace health, run replay, webhook replay, connector diagnostics, feature-flag overrides, and safe impersonation.
- Define operational SLOs for app availability, job latency, publish success rate, and connector health. Productization is not complete until the team can measure operational quality.
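One way to standardize the RLS policy templates described at the top of this section is to generate them per table from the migration tooling. This sketch emits a membership-based policy as SQL text; the `memberships` table shape and Supabase's `auth.uid()` helper are assumptions about the eventual schema:

```typescript
// Sketch: render a workspace-membership RLS policy for a tenant table.
// Assumes each customer table carries workspace_id and that a
// memberships table links the authenticated user to workspaces.
function workspaceRlsPolicy(table: string): string {
  return `
alter table ${table} enable row level security;

create policy "${table}_workspace_isolation" on ${table}
  for all
  using (
    workspace_id in (
      select workspace_id from memberships
      where user_id = auth.uid()
    )
  );`.trim();
}
```

Generating policies from one template makes "RLS on every customer table" checkable in CI: the migration suite can assert that every tenant table has a policy produced by this generator.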
Public APIs, Interfaces, and Types
- Add `AuthContext`, `WorkspaceContext`, and `SiteContext` as required inputs to all domain services and repositories. No repository should depend on a hidden global default site.
- Add first-class types for `Organization`, `Workspace`, `Site`, `Membership`, `Invite`, `Role`, `ApiKey`, `Subscription`, `Entitlement`, and `UsageEvent`.
- Add configuration types for `BrandProfile`, `EditorialPolicy`, `GenerationPolicy`, `PublishingPolicy`, `SiteProfile`, and `ContentTypeTemplate`.
- Add integration types for `ProviderConnection`, `SourceConnector`, `PublishingConnector`, `PublishTarget`, `ContentTypeMapping`, `FieldMapping`, `TaxonomyMapping`, `AuthorMapping`, and `ExternalRecord`.
- Add execution types for `JobEnvelope`, `RunArtifact`, `ApprovalPolicy`, `ApprovalRequest`, `PublishRun`, `PublishRunResult`, and `ReconciliationResult`.
- Define the canonical `ContentDocument` interface once and make every generation flow and every publishing connector use it as the intermediate payload.
- Expose authenticated BFF endpoints for workspaces, sites, memberships, invites, connections, source sync, content lifecycle, approvals, publishing, usage, and billing.
- Keep worker-only endpoints separate from customer-facing APIs. Internal job control, extraction callbacks, and webhook ingestion should not share the same contract as the customer UI.
- Add webhook verification contracts per connector and store signed event receipts for replay and audit.
- Keep direct MCP access out of the public API surface in v1. If machine access is needed, expose it through scoped API keys and bounded endpoints first.
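The canonical `ContentDocument` contract, sketched as a single interface. The field list follows this plan; the block shapes, optionality, and exact types are provisional assumptions to be refined against real connectors:

```typescript
// Sketch of the canonical publishing intermediate representation.
// Body is a structured AST; each connector serializes it to its own
// rich-text format (HTML, Gutenberg, Webflow, Contentful, portable text).
interface BodyBlock {
  type: "paragraph" | "heading" | "image" | "list" | "cta" | "custom";
  data: Record<string, unknown>;
}

interface ContentDocument {
  title: string;
  slug: string;
  dek?: string;
  excerpt?: string;
  heroImage?: { assetId: string; alt: string };
  body: BodyBlock[];
  seoTitle?: string;
  seoDescription?: string;
  canonicalUrl?: string;
  authors: string[];
  categories: string[];
  tags: string[];
  locale: string;
  publishAt?: string;                // ISO timestamp
  schemaData?: Record<string, unknown>;
  references?: string[];
  customFields?: Record<string, unknown>;
}

// Tiny example document to show the shape.
const example: ContentDocument = {
  title: "Example Post",
  slug: "example-post",
  body: [{ type: "paragraph", data: { text: "Hello." } }],
  authors: ["author_1"],
  categories: [],
  tags: ["example"],
  locale: "en-US",
};
```

Defining this interface once means generation flows and connectors evolve against a shared contract instead of each inventing its own payload shape.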
Delivery Sequence
- Freeze further internal-schema drift, document the current `consul` assumptions, and define the canonical target model so engineering stops deepening the single-customer posture.
- Create dedicated Auctor infrastructure for `dev`, `staging`, and `prod`, including auth, storage, secrets, observability, backups, and worker deployment.
- Implement identity and tenancy first: organizations, workspaces, sites, memberships, invites, roles, API keys, and `AuthContext`.
- Refactor the data model to add tenant ownership, configuration objects, connector objects, usage objects, and approval objects while preserving the strong existing content tables.
- Turn on RLS and remove service-role usage from request paths. Do this before expanding customer-facing routes or onboarding flows.
- Build the configuration system and approval model so customers can define brand, site, editorial, generation, and publishing behavior without touching internal knobs.
- Split runtime into control plane and worker plane, move all heavy or retriable work to jobs, and make every run idempotent and cost-accounted.
- Build the connector platform and ship WordPress plus Webflow first, including mapping UI, dry-run preview, publish runs, and reconciliation.
- Rebuild the customer-facing UX around onboarding, sites, pipeline, approvals, performance, team management, and billing; hide internal-only surfaces.
- Add Stripe billing, quota enforcement, support tooling, compliance baseline, incident readiness, and launch gates.
- Migrate internal data into one or more seed workspaces for the team’s own use, but onboard external design partners only into the new dedicated environment.
- Run a design-partner beta with manual publish approvals, measure failure modes, and expand connector coverage only after the core platform proves stable.
Test Plan
- Verify cross-tenant isolation at three layers: API authorization, repository scoping, and Postgres RLS.
- Verify role matrix behavior for every major action: connector admin, generation, review, approval, publish, billing, and invite management.
- Verify onboarding end-to-end: create workspace, invite teammate, add site, connect GSC, connect CMS, import keywords, sync data, generate strategy, create brief, create draft, validate, approve, publish.
- Verify connector contract conformance across all supported destinations: connection test, metadata sync, mapping preview, draft create, draft update, publish, reconciliation, and webhook verification.
- Verify publish idempotency so repeated retries cannot create duplicate CMS records or duplicate usage billing.
- Verify source sync idempotency so repeated imports do not duplicate keywords, snapshots, or documents.
- Verify mapping correctness for canonical fields, custom fields, taxonomies, authors, assets, locales, and fallback values.
- Verify approval workflow transitions and blocking behavior so no unauthorized publish can bypass approval state.
- Verify quota enforcement and spend caps across generation, crawling, syncing, and publishing.
- Verify failure recovery for model timeouts, connector outages, network failures, webhook delays, and partial external writes.
- Verify audit coverage so every sensitive mutation and support action is captured with actor, timestamp, and before/after details.
- Verify migration and backfill logic from the current internal environment into seed workspaces without orphaned rows or broken lineage.
- Verify data export and deletion paths for workspace-level DSAR scenarios.
- Verify performance under concurrent multi-tenant load for crawling, generation, publish bursts, and dashboard reads.
- Verify restore procedures from database backup and storage backup in staging before production beta.
- Verify that TypeScript build, lint, tests, migrations, and connector contract suites all pass as required release gates.
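The first verification above (cross-tenant isolation) can be expressed as a test shape like this sketch against an in-memory fake store. The store and table names are hypothetical; a real suite would run the same assertions three times, through the API layer, the repository layer, and Postgres RLS:

```typescript
// Sketch of a cross-tenant isolation test against an in-memory fake.
interface Row { workspace_id: string; id: string }

class FakeStore {
  private rows: Row[] = [];
  insert(row: Row): void { this.rows.push(row); }
  // Tenant-scoped read: only rows for the caller's workspace.
  listFor(workspaceId: string): Row[] {
    return this.rows.filter((r) => r.workspace_id === workspaceId);
  }
}

const store = new FakeStore();
store.insert({ workspace_id: "ws_a", id: "draft_1" });
store.insert({ workspace_id: "ws_b", id: "draft_2" });

// Workspace A must never see workspace B's rows, and vice versa.
const seenByA = store.listFor("ws_a");
const seenByB = store.listFor("ws_b");
```

The value of running the same assertions at all three layers is catching the case where the API filters correctly but a repository bypass or a missing RLS policy would still leak rows.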
Assumptions and Defaults
- v1 is a shared multi-tenant SaaS on a dedicated Auctor project, not a dedicated database per customer.
- Dedicated single-tenant deployments are an enterprise follow-on, not a launch requirement.
- Supabase Auth is the default identity system for v1; SAML and SCIM are deferred.
- Trigger.dev is the default job orchestration layer for v1.
- The Python extraction service remains an internal worker dependency rather than becoming the public product backend.
- The initial customer-facing surface is the web app only; desktop runtime, inbound workspace sync, and direct MCP remain internal/admin-only.
- Beta requires human approval before external publish by default.
- WordPress and Webflow are the first destination connectors; Contentful and Sanity are second-wave connectors.
- Sitemap/robots, CSV import, Google Search Console, and GA4 are the first source connectors.
- The existing strong `auctor` operational schema is preserved and tenantized rather than replaced wholesale.
- The current direct `consul` publish path is retired after connector-based publishing is in place and internal data is migrated.