Executive Summary
This branch replaces a multi-agent orchestrator architecture (20 agents, 113 tools, 13 workflows) with a single-agent-with-direct-tools architecture. The core thesis: instead of an LLM deciding which sub-agent to route to (an extra hop costing latency and tokens), all ~96 tools are loaded directly onto one agent, with a processor pipeline handling safety, policy, and intelligence concerns.
Key outcomes:
- 10 specialized agents deleted, replaced by 1 unified agent factory
- 7 workflows deleted (4 HITL + 3 orchestration), replaced by tool-level suspend/resume
- 2,904-line monolithic messaging router decomposed into 9-stage pipeline
- 12 new input/output processors for safety, dedup, compaction, and error recovery
- 19 composable prompt sections with channel-aware rendering
- 4-layer cascading tool policy pipeline
- Per-tool mutation fingerprinting for dedup
Table of Contents
- Agent Architecture
- Tool System
- Processor Pipeline
- Prompt Architecture
- Middleware
- Messaging Gateway
- Deleted Workflows & Services
- Web App Changes
- New Documentation
- Tone & Voice Refinements
1. Agent Architecture
Before (main): 20 agents
| Agent | Model | Purpose |
|---|---|---|
| webOrchestratorAgent | gpt-4.1-mini | Central router for web chat, delegated to 7 sub-agents + 4 HITL workflows |
| imessageOrchestratorAgent | gpt-4.1-mini | Central router for iMessage/SMS |
| planningAgent | gpt-4o | ReWOO-style task decomposition |
| validatorAgent | gpt-4o | Maker-Checker result validation |
| analysisAgent | gpt-4o-mini | Content extraction/classification |
| googleContactsAgent | gpt-4o-mini | 9 contact tools |
| googleDocsAgent | gpt-4o-mini | 13 doc tools |
| googleDriveQueryAgent | gpt-4o-mini | 6 drive read tools |
| googleDriveActionAgent | gpt-4o-mini | 14 drive write tools |
| slackQueryAgent | gpt-4o-mini | 7 slack read tools |
| slackActionAgent | gpt-4o-mini | 6 slack write tools |
| imessageAgent | gpt-4o-mini | 2 iMessage tools |
| reminderAgent | gpt-4o-mini | Reminder tools |
| + 7 retained agents | various | email orchestrator, gmail, calendar, scheduling, triage, sales, onboarding |
After (branch): 10 agents
| Agent | Model | Purpose |
|---|---|---|
| consulAgent (NEW) | gpt-4.1 | Unified web chat agent, ~96 tools loaded directly |
| imessageConsulAgent (NEW) | gpt-4.1-mini | iMessage/SMS variant, same factory |
| emailOrchestratorAgent | (kept) | Inbound email triage routing |
| gmailQueryAgent | (kept) | Used by email orchestrator |
| gmailActionAgent | (kept) | Used by email orchestrator |
| googleCalendarQueryAgent | (kept) | Used by email orchestrator |
| googleCalendarActionAgent | (kept) | Used by email orchestrator |
| schedulingAgent | (kept) | Scheduling workflow |
| emailTriageAgent | (kept) | Email triage workflow |
| salesAgent | (kept) | Sales processing |
| onboardingDemoAgent | (kept) | New user onboarding |
Unified Agent Factory (agents/consul-agent.ts, 253 lines)
createConsulAgent(channel) generates both variants from a single code path:
| Setting | Web | iMessage |
|---|---|---|
| Model | gpt-4.1 (fallback: gpt-4.1-mini) | gpt-4.1-mini (fallback: gpt-4o-mini) |
| Memory | lastMessages: false | lastMessages: 10 |
| maxSteps | 10 | 8 |
| autoResumeSuspendedTools | false (UI buttons) | true (auto-resume) |
| Extra processors | -- | ConfirmationGate, SendOnceGuard |
| Compaction thresholds | 0.70 / 0.85 | 0.55 / 0.70 |
| Semantic recall | topK: 3, messageRange: 2 | topK: 2, messageRange: 1 |
| Observational memory | 30k / 50k tokens | 15k / 30k tokens |
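As a rough illustration of how one factory can serve both channels, the sketch below keys a settings record by channel. The type names, field names, and `resolveChannelSettings` are hypothetical and are not taken from agents/consul-agent.ts; only the values come from the table above.

```typescript
// Hypothetical shapes for illustration only; not the contents of agents/consul-agent.ts.
type Channel = "web" | "imessage";

interface ChannelSettings {
  model: string;
  fallbackModel: string;
  lastMessages: number | false;
  maxSteps: number;
  autoResumeSuspendedTools: boolean;
  extraProcessors: string[];
  compactionThresholds: [number, number]; // the two values listed per channel
}

// Values copied from the settings table above.
const CHANNEL_SETTINGS: Record<Channel, ChannelSettings> = {
  web: {
    model: "gpt-4.1",
    fallbackModel: "gpt-4.1-mini",
    lastMessages: false,
    maxSteps: 10,
    autoResumeSuspendedTools: false, // the web UI shows confirmation buttons instead
    extraProcessors: [],
    compactionThresholds: [0.7, 0.85],
  },
  imessage: {
    model: "gpt-4.1-mini",
    fallbackModel: "gpt-4o-mini",
    lastMessages: 10,
    maxSteps: 8,
    autoResumeSuspendedTools: true,
    extraProcessors: ["ConfirmationGate", "SendOnceGuard"],
    compactionThresholds: [0.55, 0.7],
  },
};

// Single code path: the channel argument is the only thing that varies.
function resolveChannelSettings(channel: Channel): ChannelSettings {
  return CHANNEL_SETTINGS[channel];
}
```

Keeping the per-channel differences in data rather than in branching code is one way to keep the two variants from drifting apart.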
Memory: Three-tier system:
- Working memory (structured user preferences, resource-scoped)
- Semantic recall (cross-thread RAG via LibSQL vector)
- Observational memory (auto-summaries via gpt-4.1-nano, resource-scoped)
2. Tool System
+5,282 / -2,866 lines across 34 files in tools/
New Infrastructure
| Component | File(s) | Purpose |
|---|---|---|
| Tool Registry | tools/registry.ts (261 lines) | Single source of truth — flat map of all ~96 tools |
| Tool Index | tools/index.ts (101 lines) | Public API: exports registry, groups, policy, legacy compat |
| Tool Groups | tools/groups.ts (268 lines) | Semantic grouping: "<service>:<level>" (read/write/confirm). Supports wildcards ("gmail:*") and exclusions ("gmail:!confirm") |
| Tool Classification | tools/tool-classification.ts (269 lines) | Every tool classified as read, write, or confirm. Plus MUTATION_FINGERPRINT_FIELDS and READ_IDENTITY_FIELDS |
| Tool Metadata | tools/tool-metadata.ts (1,763 lines) | Per-tool: schemaDescription, promptSummary, triggers, notToBeConfusedWith, parameterGuidance. Includes buildDisambiguationMatrix() for system prompt injection |
| Policy Pipeline | tools/policy/ (5 files, 402 lines) | 4-layer cascade: Channel → ConnectedIntegrations → Safety → UserOverrides. Resolves allowed tools per request |
| Mutation Fingerprinting | tools/mutation/ (4 files, 251 lines) | Per-field fingerprinting with SHA-256, MutationTracker with 5-min TTL |
| Confirmation Helper | tools/lib/with-confirmation.ts (64 lines) | requireConfirmation() — uses Mastra's native suspend/resume. No-op on iMessage (agent prompt handles it) |
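The group selector syntax from the Tool Groups row can be sketched as follows. This is a minimal sketch under stated assumptions: `expandSelector`, the `ToolEntry` shape, and the tools `searchEmails` and `createDraft` are hypothetical, and "gmail:!confirm" is assumed to mean "every Gmail tool except the confirm level".

```typescript
type Level = "read" | "write" | "confirm";

interface ToolEntry {
  name: string;
  service: string;
  level: Level;
}

// Tiny stand-in registry. searchEmails and createDraft are invented for the example.
const TOOLS: ToolEntry[] = [
  { name: "searchEmails", service: "gmail", level: "read" },
  { name: "createDraft", service: "gmail", level: "write" },
  { name: "sendEmail", service: "gmail", level: "confirm" },
  { name: "deleteContact", service: "contacts", level: "confirm" },
];

// Expands "<service>:<level>", "<service>:*", or "<service>:!<level>" into tool names.
function expandSelector(selector: string, tools: ToolEntry[]): string[] {
  const idx = selector.indexOf(":");
  if (idx < 0) throw new Error(`Invalid selector: ${selector}`);
  const service = selector.slice(0, idx);
  const spec = selector.slice(idx + 1);

  return tools
    .filter((t) => t.service === service)
    .filter((t) => {
      if (spec === "*") return true; // wildcard: every level
      if (spec.startsWith("!")) return t.level !== spec.slice(1); // exclusion
      return t.level === spec; // exact level
    })
    .map((t) => t.name);
}
```

For example, `expandSelector("gmail:!confirm", TOOLS)` yields the read and write Gmail tools and leaves `sendEmail` out.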
Deleted Tools (-2,538 lines)
| Tool | Lines | Replaced By |
|---|---|---|
| complex-task-tool.ts | 215 | Single agent's direct tool chaining |
| compose-email-tool.ts | 567 | draft-email-tool.ts (271 lines) |
| resume-workflow-tool.ts | 551 | Mastra native suspend/resume |
| scheduling/schedule-meeting-tool.ts | 1,116 | find-available-slots-tool.ts (388 lines) |
| start-schedule-meeting-tool.ts | 89 | Direct Inngest workflow trigger |
New Tools
| Tool | Lines | Purpose |
|---|---|---|
| draft-email-tool.ts | 271 | Resolves recipient, generates AI draft, returns preview (classified as "read" because it has no side effects) |
| find-available-slots-tool.ts | 388 | Fetches scheduling prefs, resolves attendees, queries FreeBusy, finds available slots |
| bulkDismissRemindersTool | ~40 | Dismiss multiple reminders at once (in reminder-tools.ts) |
Modified Tools
Confirmation gates added to all confirm-classified tools using requireConfirmation():
- Gmail: sendEmail, sendDraft, trashEmail, batchModifyEmails
- Calendar: createEvent, updateEvent, quickAddEvent, addAttendees, removeAttendees, deleteEvent, cancelEvent
- Drive: shareFile, updatePermission, removePermission, trashFile
- Docs: deleteDocument
- Slack: sendSlackMessage
- Contacts: deleteContact
Gmail tools (+902 lines): New batchFetchFullMessages(), gzip compression, new fetchSentRepliesTool, awaitingReplyTool, smartInboxTool.
Resolve recipient tool: Now uses tiered search (curated relationships → AI recommendations → broader sources) with early return on high-confidence match.
All tool descriptions rewritten to concise, parameter-focused format matching the metadata registry.
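The tiered recipient search with early return, mentioned above for the resolve recipient tool, could look roughly like this. All names, the `Candidate` shape, and the 0.9 threshold are assumptions for illustration. The real tool presumably performs asynchronous lookups, which this sketch replaces with synchronous functions.

```typescript
interface Candidate {
  email: string;
  confidence: number; // 0..1, higher is better
}

// A tier is any lookup that returns scored candidates for a query.
type SearchTier = (query: string) => Candidate[];

// Walks the tiers in order and stops as soon as one produces a high-confidence match.
function resolveRecipient(
  query: string,
  tiers: SearchTier[],
  threshold = 0.9, // hypothetical cutoff
): Candidate | undefined {
  let best: Candidate | undefined;
  for (const tier of tiers) {
    for (const candidate of tier(query)) {
      if (!best || candidate.confidence > best.confidence) best = candidate;
    }
    // Early return: later (broader, slower) tiers are never queried.
    if (best && best.confidence >= threshold) return best;
  }
  return best; // best effort when no tier was confident enough
}
```

Ordering the tiers from curated to broad means the cheapest and most trusted source answers most lookups on its own.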
3. Processor Pipeline
12 new processors form a layered safety and intelligence system.
Input Processors (per-step)
| # | Processor | Hook | Purpose |
|---|---|---|---|
| 1 | DateTimeInjector | processInput | Injects date/time into user message (not system prompt) for prompt cache stability |
| 2 | MessageDeduplicator | processInput | Prevents OpenAI Responses API duplicate item_reference errors |
| 3 | ToolPolicyProcessor | processInputStep | Resolves allowed tools via 4-layer policy cascade. Caches after step 0 |
| 4 | ConfirmationGateProcessor (iMessage only) | processInputStep | Blocks CONFIRM tools unless prior turn communicated with user |
| 5 | MutationGuardProcessor | processInputStep | Prevents duplicate mutations via fingerprinting. warn for writes, block for confirms |
| 6 | RepeatCallDetector | processInputStep | Prevents redundant read calls via identity-based dedup |
| 7 | SendOnceGuard (iMessage only) | processInputStep | After sendResponse, disables all tools and forces empty return |
| 8 | ErrorRecoveryProcessor | processInputStep | Truncates oversized tool results (4k/8k chars) + classifies errors with recovery hints |
| 9 | StagedCompactionProcessor | processInputStep | Summarize-then-prune via gpt-4.1-nano. Channel-aware thresholds |
| 10 | ContextWindowGuard | processInputStep | Last-resort: warns at ~32k remaining, strips tools at ~16k remaining |
| 11 | EnsureFinalResponseProcessor | processInputStep | On final step, removes tools and forces response |
| 12 | TokenLimiter | processInputStep | Hard truncation safety net. Always LAST |
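The 4-layer cascade that ToolPolicyProcessor resolves can be pictured as a chain of filters in which each layer only narrows what the previous one allowed. The sketch below is illustrative only: the layer names follow the document, but every rule inside them (`webOnly`, `readOnly`, the disabled list) is a hypothetical stand-in, not the actual policy in tools/policy/.

```typescript
type Level = "read" | "write" | "confirm";

interface Tool {
  name: string;
  service: string;
  level: Level;
  webOnly?: boolean; // hypothetical flag
}

interface PolicyContext {
  channel: "web" | "imessage";
  connectedServices: string[];
  readOnly: boolean; // hypothetical safety switch
  userDisabledTools: string[];
}

type PolicyLayer = (tools: Tool[], ctx: PolicyContext) => Tool[];

// Layer order follows the document: Channel, ConnectedIntegrations, Safety, UserOverrides.
const POLICY_LAYERS: PolicyLayer[] = [
  (tools, ctx) => (ctx.channel === "web" ? tools : tools.filter((t) => !t.webOnly)),
  (tools, ctx) => tools.filter((t) => ctx.connectedServices.includes(t.service)),
  (tools, ctx) => (ctx.readOnly ? tools.filter((t) => t.level === "read") : tools),
  (tools, ctx) => tools.filter((t) => !ctx.userDisabledTools.includes(t.name)),
];

// Runs every layer in order and returns the names of the tools that survive.
function resolveAllowedTools(all: Tool[], ctx: PolicyContext): string[] {
  return POLICY_LAYERS.reduce((tools, layer) => layer(tools, ctx), all).map((t) => t.name);
}
```

Because the context does not change within a request in this sketch, the result could be computed once and reused for later steps, which is in the spirit of the "caches after step 0" note in the table.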
Output Processors (post-generation)
| Processor | Purpose |
|---|---|
| ToolResultTrimmer | Head+tail truncation (1.5k/3k chars) before memory saves. Strips verbose fields |
Two-Tier Truncation
| Layer | When | Limits | Purpose |
|---|---|---|---|
| ErrorRecoveryProcessor | Per-step (LLM view) | 4,000 / 8,000 chars | Rich data during current generation |
| ToolResultTrimmer | Post-generation (memory) | 1,500 / 3,000 chars | Lean storage across turns |
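Head+tail truncation, as used by ToolResultTrimmer, keeps the start and end of a long result and drops the middle. The function below is a minimal sketch of that idea. The name `truncateHeadTail`, the marker text, and the even head/tail split are assumptions, and the character limit is passed in rather than hardcoded to the values in the table.

```typescript
// Keeps the first and last parts of `text` so the result is at most `limit` characters.
// Assumes `limit` is larger than the marker; otherwise only the marker is returned.
function truncateHeadTail(
  text: string,
  limit: number,
  marker = "\n...[truncated]...\n", // hypothetical marker
): string {
  if (text.length <= limit) return text;

  const keep = Math.max(limit - marker.length, 0);
  const head = Math.ceil(keep / 2);
  const tail = keep - head;

  return text.slice(0, head) + marker + text.slice(text.length - tail);
}
```

Calling the same helper with a larger limit per step and a smaller limit before memory saves gives the two tiers described above.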
Mutation Safety Layering
Three distinct processors at different levels:
- ToolPolicyProcessor — which tools are available at all
- ConfirmationGateProcessor — which tools need prior user communication (iMessage)
- MutationGuardProcessor — which mutations have already been executed
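To make the third layer concrete, here is a hedged sketch of per-field fingerprinting with SHA-256 and a 5-minute TTL. The field list for sendEmail, the canonical string format, and the class name `SketchMutationTracker` are assumptions. The real MUTATION_FINGERPRINT_FIELDS and MutationTracker in tools/mutation/ may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical per-tool field list; only these fields identify a mutation.
const FINGERPRINT_FIELDS: Record<string, string[]> = {
  sendEmail: ["to", "subject", "body"],
};

// Hashes the tool name plus its identifying fields in a fixed order.
function fingerprint(tool: string, args: Record<string, unknown>): string {
  const fields = FINGERPRINT_FIELDS[tool] ?? Object.keys(args).sort();
  const canonical = fields
    .map((f) => `${f}=${JSON.stringify(args[f] ?? null)}`)
    .join("&");
  return createHash("sha256").update(`${tool}|${canonical}`).digest("hex");
}

// Remembers fingerprints for a limited time (5 minutes by default).
class SketchMutationTracker {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true if the fingerprint was already recorded within the TTL.
  // Otherwise records it and returns false.
  isDuplicate(fp: string): boolean {
    const t = this.now();
    const prev = this.seen.get(fp);
    if (prev !== undefined && t - prev < this.ttlMs) return true;
    this.seen.set(fp, t);
    return false;
  }
}
```

A guard built on this could warn on a duplicate write and block a duplicate confirm-level call, matching the behavior listed for MutationGuardProcessor.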
4. Prompt Architecture
19 composable sections assembled by a builder pattern (prompts/builder.ts).
Priority Bands
| Range | Category | Sections |
|---|---|---|
| 0-99 | Identity & context | identity (0), context (10) |
| 100-199 | Tool sections | tool-listing (100), tool-call-style (110), task-planning (115), tool-capability-hints (120), contextual-references (130) |
| 200-299 | Capability instructions | email-composition (200), meeting-scheduling (210) |
| 300-399 | Behavioral rules | confirmation-behavior (300), cross-service (310), memory-recall (315), working-memory (320), web-formatting (330), imessage-formatting (330), response-delivery (335), reminders (340), greetings (350) |
| 400-499 | Error handling & safety | errors (400), critical-rules (410) |
Key Design Decisions
- Cache-stable system prompt: Date/time is NOT in the system prompt; DateTimeInjector puts it in user messages. This enables OpenAI's prompt caching (50% discount on cached tokens).
- Channel-aware rendering: Many sections render different content per channel (identity tone, confirmation flow, formatting rules).
- Conditional sections: email-composition only renders when Gmail is connected, meeting-scheduling when Calendar is connected, cross-service when 2+ services are connected.
- Voice calibration: identity section applies toneStyle (formal/brief/casual/balanced) from agent preferences with channel-specific adjustments.
- Disambiguation matrix: tool-capability-hints dynamically builds a confusion matrix from tool-metadata.ts for connected services only.
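The builder's priority ordering, conditional sections, and channel-aware rendering can be sketched together. This is an illustration under assumptions: the `PromptSection` shape, `buildSystemPrompt`, and the placeholder section bodies are hypothetical and do not reproduce prompts/builder.ts. The section ids and priorities are taken from the table above.

```typescript
type Channel = "web" | "imessage";

interface PromptContext {
  channel: Channel;
  connectedServices: string[];
}

interface PromptSection {
  id: string;
  priority: number; // lower renders earlier
  when?: (ctx: PromptContext) => boolean; // omitted means "always render"
  render: (ctx: PromptContext) => string;
}

// Four of the documented sections, with placeholder bodies.
const SECTIONS: PromptSection[] = [
  { id: "critical-rules", priority: 410, render: () => "RULES" },
  {
    id: "email-composition",
    priority: 200,
    when: (ctx) => ctx.connectedServices.includes("gmail"),
    render: () => "EMAIL",
  },
  {
    id: "identity",
    priority: 0,
    render: (ctx) => (ctx.channel === "imessage" ? "IDENTITY (brief)" : "IDENTITY"),
  },
  {
    id: "cross-service",
    priority: 310,
    when: (ctx) => ctx.connectedServices.length >= 2,
    render: () => "CROSS",
  },
];

// Filters out inactive sections, orders the rest by priority, and joins them.
function buildSystemPrompt(sections: PromptSection[], ctx: PromptContext): string {
  return sections
    .filter((s) => (s.when ? s.when(ctx) : true))
    .sort((a, b) => a.priority - b.priority)
    .map((s) => s.render(ctx))
    .join("\n\n");
}
```

Because no section reads the clock, the same context always produces the same prompt text, which is what keeps the prefix cacheable.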
5. Middleware
Decomposed into 5 focused middleware functions in middleware/index.ts:
| Middleware | Scope | Purpose |
|---|---|---|
| authMiddleware | Global | JWT verification, gateway secret auth, sets userId |
| bodyParsingMiddleware | POST requests | Extracts whitelisted context (22 keys) from request body |
| contextPopulationMiddleware | API + custom routes | Fetches profile, preferences, connected integrations from Supabase (5-min cache) |
| dateTimeMiddleware | Global | Formats current date/time in user's timezone |
| sessionLoggingMiddleware | POST /chat | Records chat session activity (non-blocking) |
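The whitelisting step in bodyParsingMiddleware amounts to copying only known keys out of an untrusted request body. The sketch below shows that idea in isolation. The three key names are invented for the example (the real list has 22 keys), and `pickAllowedContext` is a hypothetical helper rather than the middleware itself.

```typescript
// Hypothetical subset of the whitelist; the real middleware allows 22 keys.
const ALLOWED_CONTEXT_KEYS = ["timezone", "locale", "threadId"] as const;

// Copies only whitelisted keys, so unexpected fields never reach the agent context.
function pickAllowedContext(body: Record<string, unknown>): Record<string, unknown> {
  const context: Record<string, unknown> = {};
  for (const key of ALLOWED_CONTEXT_KEYS) {
    if (Object.prototype.hasOwnProperty.call(body, key)) {
      context[key] = body[key];
    }
  }
  return context;
}
```

An allow-list fails closed: a new or misspelled field is dropped until someone adds it deliberately.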
6. Messaging Gateway
The largest single change. Monolithic router.ts (2,904 lines) decomposed into a pipeline architecture (197-line thin orchestrator).
New Architecture
A. Channel Plugin System (channels/)
- ChannelPlugin interface with capability metadata (supportsTypingIndicator, supportsReactions, supportsEffects, supportsMarkdown)
- ChannelRegistry for lifecycle management
- Plugins: IMessagePlugin, SMSPlugin, AgentMailPlugin
- Replaces hardcoded if (channel === "imessage") with capability checks
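A capability check in place of a channel-name comparison might look like the sketch below. The four capability flags are the ones named above. The simplified `ChannelPlugin` and `ChannelRegistry` shapes, the flag values given to each plugin, and the naive `stripMarkdown` are assumptions made for the example.

```typescript
interface ChannelCapabilities {
  supportsTypingIndicator: boolean;
  supportsReactions: boolean;
  supportsEffects: boolean;
  supportsMarkdown: boolean;
}

// Simplified stand-in for the real plugin interface.
interface ChannelPlugin {
  id: string;
  capabilities: ChannelCapabilities;
}

class ChannelRegistry {
  private readonly plugins = new Map<string, ChannelPlugin>();

  register(plugin: ChannelPlugin): void {
    this.plugins.set(plugin.id, plugin);
  }

  get(id: string): ChannelPlugin {
    const plugin = this.plugins.get(id);
    if (!plugin) throw new Error(`Unknown channel: ${id}`);
    return plugin;
  }
}

// Deliberately naive: removes common Markdown marker characters only.
function stripMarkdown(text: string): string {
  return text.replace(/[*_`#]/g, "");
}

// Callers ask what the channel can do instead of which channel it is.
function prepareReply(plugin: ChannelPlugin, text: string): string {
  return plugin.capabilities.supportsMarkdown ? text : stripMarkdown(text);
}
```

With this shape, adding a channel means registering a plugin with its flags rather than touching every branch that formats a reply.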
B. Pipeline Stages (pipeline/)
    IncomingMessage
      → CapabilityResolver (resolve userId/orgId)
      → EmailDeduplicator (skip duplicates)
      → ProfileEnricher (timezone, name, email, EA identity)
      → ContextBuilder (datetime, identities, scheduling, integrations)
      → RouteResolver (priority-based bindings)
      → AgentCaller (HTTP call with retry + presence)
      → ResponseProcessor (messaging-handled detection, markdown strip)
      → SessionRecorder (token attribution)
      → ProspectPostProcessor (lead scoring)

C. Route Binding System (pipeline/bindings/)
| Priority | Binding | Action |
|---|---|---|
| 100 | SuspendedSchedulingWorkflow | Resume suspended workflow (email) |
| 90 | InngestApproval | Route to Inngest for meeting confirmation |
| 80 | CCScheduling | Start new scheduling workflow |
| 50 | Prospect | Route to sales-agent |
| 30 | EmailChannel | Route to email-orchestrator-agent |
| 30 | IMessageChannel | Route to imessage-consul-agent |
| 0 | Fallback | Route to consul-agent |
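Priority-based resolution over this table can be sketched as "sort by priority, take the first binding that matches". The binding names, priorities, and targets below come from the table. The `InboundMessage` shape and the match predicates are hypothetical, and the three workflow bindings are omitted for brevity.

```typescript
interface InboundMessage {
  channel: "email" | "imessage" | "sms" | "web";
  isProspect: boolean;
}

interface RouteBinding {
  name: string;
  priority: number;
  target: string;
  matches: (msg: InboundMessage) => boolean; // hypothetical predicates
}

const BINDINGS: RouteBinding[] = [
  { name: "Fallback", priority: 0, target: "consul-agent", matches: () => true },
  { name: "IMessageChannel", priority: 30, target: "imessage-consul-agent", matches: (m) => m.channel === "imessage" },
  { name: "EmailChannel", priority: 30, target: "email-orchestrator-agent", matches: (m) => m.channel === "email" },
  { name: "Prospect", priority: 50, target: "sales-agent", matches: (m) => m.isProspect },
];

// Highest priority wins; the Fallback binding guarantees a match in this sketch.
function resolveRoute(msg: InboundMessage, bindings: RouteBinding[] = BINDINGS): string {
  const ordered = [...bindings].sort((a, b) => b.priority - a.priority);
  const hit = ordered.find((b) => b.matches(msg));
  if (!hit) throw new Error("No binding matched");
  return hit.target;
}
```

Registration order does not matter here because the list is sorted on every call. The two priority-30 bindings never both match, since their predicates test different channels.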
D. Session Queue (pipeline/session-queue.ts)
- Concurrency management for rapid-fire messages per session
- 4 modes: queue (FIFO), collect (batch with timeout), interrupt (cancel + restart), followup (queue + combine)
E. ReplyDispatcher (lib/reply-dispatcher.ts)
- Unified message delivery: dedup (5s window), presence management, tapback delivery, paced multi-message, markdown stripping, iMessage effects
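The 5-second dedup window can be illustrated with a small helper that remembers when each recipient/text pair was last sent. This is only a sketch: `ReplyDeduplicator` and its key format are hypothetical, and the real ReplyDispatcher may key on something other than the message text.

```typescript
class ReplyDeduplicator {
  private readonly lastSent = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  // The clock is injectable so the window can be tested without waiting.
  constructor(windowMs = 5000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns false when the same text went to the same recipient within the window.
  shouldSend(recipient: string, text: string): boolean {
    const key = `${recipient}\u0000${text}`;
    const t = this.now();
    const prev = this.lastSent.get(key);
    if (prev !== undefined && t - prev < this.windowMs) return false;
    this.lastSent.set(key, t);
    return true;
  }
}
```

A production version would also need to evict old entries. This sketch leaves them in the map.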
F. Extracted Libraries (lib/)
- agentmail-history.ts: AgentMail conversation fetcher
- encryption.ts: AES-256-GCM decryption
- intent-detection.ts: Reschedule intent detection
- markdown.ts: Markdown stripping
- scheduling-api.ts: Scheduling workflow HTTP helpers
- tool-response.ts: Tool response parsing helpers
7. Deleted Workflows & Services
Deleted Workflows (-5,532 lines)
| Workflow | Lines | Replacement |
|---|---|---|
| hitl/calendar-action-workflow.ts | 1,778 | Tool-level requireConfirmation() + autoResumeSuspendedTools |
| hitl/email-action-workflow.ts | 1,218 | Same |
| hitl/drive-action-workflow.ts | 689 | Same |
| hitl/slack-action-workflow.ts | 449 | Same |
| compose-email-workflow.ts | 848 | Single agent chains draftEmail → sendEmail directly |
| complex-task-workflow.ts | 501 | Single agent with maxSteps: 10 and direct tool access |
| schedule-meeting-workflow.ts | 49 | Direct Inngest trigger |
Deleted Services (-983 lines)
| Service | Lines | Replacement |
|---|---|---|
| approval-service.ts | 557 | Mastra native tool suspend/resume |
| artifact-approval-handler.ts | 426 | Tool-level suspend/resume |
Deleted Utilities (-523+ lines)
| Utility | Lines | Reason |
|---|---|---|
| utils/step-executor.ts | 313 | Only used by deleted complex-task-workflow.ts |
| utils/variable-resolver.ts | ~100+ | Only used by step-executor.ts |
| types/planning.ts | 318 | Type definitions for deleted plan/execute pattern |
Retained Workflows (unchanged)
- emailTriageWorkflow, dailyBriefWorkflow, salesProcessingWorkflow, tagNotificationWorkflow, imessageSendWorkflow
- Inngest workflows (email scheduling, schedule meeting, relationships)
8. Web App Changes
Chat Interface (chat-interface.tsx)
- Added data-tool-call-suspended rendering: Displays confirmation previews from suspended tools (HITL). Shows suspendPayload.preview or suspendPayload.message as an assistant message.
- Simplified tool activity display: Removed streamingToolName state, onData callback parsing, and the associated useEffect. Now relies solely on the useToolActivity hook.
Tool Display Names
- Added: draft-email, find-available-slots
- Renamed: get-freebusy → get-free-busy, cancel-reminder → dismiss-reminder
- Removed: execute-complex-task, executeComplexTask
Relationships Components
- Refactored 4 dialog components to use the React key prop pattern instead of useEffect for form state sync.
9. New Documentation
| File | Lines | Purpose |
|---|---|---|
| docs/AGENT_REFACTOR_PLAN.md | 970 | Comprehensive refactor plan with architecture analysis, problem statement, and phased implementation |
| docs/PRODUCT_OVERVIEW.md | 234 | Product documentation |
| docs/TOOL_SUSPENSION_COMPLETE.md | 218 | Completed tool-level HITL implementation docs |
| docs/TOOL_SUSPENSION_IMPLEMENTATION.md | 95 | Technical implementation guide for tool suspension |
10. Tone & Voice Refinements
Across all remaining agents, exclamation marks and corporate pleasantries have been replaced with professional, direct language:
| Agent | Before | After |
|---|---|---|
| Onboarding | "Keep the experience magical" | "Demonstrate practical utility" |
| Onboarding | "Show the user how valuable you can be" | "Let the results speak for themselves" |
| Scheduling | "Happy to help find a time, here are some available:" | "Here are some available times:" |
| Email Composer | "Got it! Here are some updated times" | "Understood. Here are some updated times" |
| Email Triage | "Thanks for reaching out!" | "Thank you for reaching out." |
| iMessage Greetings | "What do you need?" | "How can I be of assistance?" |
| iMessage Formatting | (none) | Multi-bubble calendar events with meet links |
Commit History
| Hash | Message |
|---|---|
| c07dcea | refactor(agents): single-agent architecture with intelligence improvements |
| b496eb6 | refactor(agents): harden processor pipeline, fix tool classification, and improve tool disambiguation |
| 5389033 | refactor(agents): improve iMessage greeting tone and add multi-bubble calendar formatting |