Agent Refactor

Executive Summary

This branch replaces a multi-agent orchestrator architecture (20 agents, 113 tools, 13 workflows) with a single agent that calls its tools directly. The core thesis: rather than having an LLM decide which sub-agent to route to (an extra hop that costs latency and tokens), all ~96 tools are loaded directly onto one agent, and a processor pipeline handles safety, policy, and intelligence concerns.

Key outcomes:

  • 10 specialized agents deleted, replaced by 1 unified agent factory
  • 7 workflows deleted (4 HITL + 3 orchestration), replaced by tool-level suspend/resume
  • 2,904-line monolithic messaging router decomposed into 9-stage pipeline
  • 12 new input/output processors for safety, dedup, compaction, and error recovery
  • 19 composable prompt sections with channel-aware rendering
  • 4-layer cascading tool policy pipeline
  • Per-tool mutation fingerprinting for dedup

Table of Contents

  1. Agent Architecture
  2. Tool System
  3. Processor Pipeline
  4. Prompt Architecture
  5. Middleware
  6. Messaging Gateway
  7. Deleted Workflows & Services
  8. Web App Changes
  9. New Documentation
  10. Tone & Voice Refinements

1. Agent Architecture

Before (main): 20 agents

| Agent | Model | Purpose |
| --- | --- | --- |
| webOrchestratorAgent | gpt-4.1-mini | Central router for web chat, delegated to 7 sub-agents + 4 HITL workflows |
| imessageOrchestratorAgent | gpt-4.1-mini | Central router for iMessage/SMS |
| planningAgent | gpt-4o | ReWOO-style task decomposition |
| validatorAgent | gpt-4o | Maker-Checker result validation |
| analysisAgent | gpt-4o-mini | Content extraction/classification |
| googleContactsAgent | gpt-4o-mini | 9 contact tools |
| googleDocsAgent | gpt-4o-mini | 13 doc tools |
| googleDriveQueryAgent | gpt-4o-mini | 6 drive read tools |
| googleDriveActionAgent | gpt-4o-mini | 14 drive write tools |
| slackQueryAgent | gpt-4o-mini | 7 slack read tools |
| slackActionAgent | gpt-4o-mini | 6 slack write tools |
| imessageAgent | gpt-4o-mini | 2 iMessage tools |
| reminderAgent | gpt-4o-mini | Reminder tools |
| + 7 retained agents | various | email orchestrator, gmail, calendar, scheduling, triage, sales, onboarding |

After (branch): 10 agents

| Agent | Model | Purpose |
| --- | --- | --- |
| consulAgent (NEW) | gpt-4.1 | Unified web chat agent, ~96 tools loaded directly |
| imessageConsulAgent (NEW) | gpt-4.1-mini | iMessage/SMS variant, same factory |
| emailOrchestratorAgent | (kept) | Inbound email triage routing |
| gmailQueryAgent | (kept) | Used by email orchestrator |
| gmailActionAgent | (kept) | Used by email orchestrator |
| googleCalendarQueryAgent | (kept) | Used by email orchestrator |
| googleCalendarActionAgent | (kept) | Used by email orchestrator |
| schedulingAgent | (kept) | Scheduling workflow |
| emailTriageAgent | (kept) | Email triage workflow |
| salesAgent | (kept) | Sales processing |
| onboardingDemoAgent | (kept) | New user onboarding |

Unified Agent Factory (agents/consul-agent.ts, 253 lines)

createConsulAgent(channel) generates both variants from a single code path:

| Setting | Web | iMessage |
| --- | --- | --- |
| Model | gpt-4.1 (fallback: gpt-4.1-mini) | gpt-4.1-mini (fallback: gpt-4o-mini) |
| Memory | lastMessages: false | lastMessages: 10 |
| maxSteps | 10 | 8 |
| autoResumeSuspendedTools | false (UI buttons) | true (auto-resume) |
| Extra processors | (none) | ConfirmationGate, SendOnceGuard |
| Compaction thresholds | 0.70 / 0.85 | 0.55 / 0.70 |
| Semantic recall | topK: 3, messageRange: 2 | topK: 2, messageRange: 1 |
| Observational memory | 30k / 50k tokens | 15k / 30k tokens |
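
As a rough illustration of how one code path can serve both channels, the settings above can be held in a per-channel lookup. This is a minimal sketch, not the contents of agents/consul-agent.ts: the `ChannelSettings` shape, the `CHANNEL_SETTINGS` constant, and `settingsFor()` are hypothetical names, and only the values come from the table.

```typescript
// Hypothetical types; only the values are taken from the settings table.
type Channel = "web" | "imessage";

interface ChannelSettings {
  model: string;
  fallbackModel: string;
  lastMessages: number | false;
  maxSteps: number;
  autoResumeSuspendedTools: boolean;
  extraProcessors: readonly string[];
  // The source lists two numbers per channel without naming them,
  // so they are kept as ordered pairs here.
  compactionThresholds: readonly [number, number];
  semanticRecall: { topK: number; messageRange: number };
  observationalMemoryTokens: readonly [number, number];
}

const CHANNEL_SETTINGS: Record<Channel, ChannelSettings> = {
  web: {
    model: "gpt-4.1",
    fallbackModel: "gpt-4.1-mini",
    lastMessages: false,
    maxSteps: 10,
    autoResumeSuspendedTools: false, // the web UI shows confirmation buttons
    extraProcessors: [],
    compactionThresholds: [0.7, 0.85],
    semanticRecall: { topK: 3, messageRange: 2 },
    observationalMemoryTokens: [30000, 50000],
  },
  imessage: {
    model: "gpt-4.1-mini",
    fallbackModel: "gpt-4o-mini",
    lastMessages: 10,
    maxSteps: 8,
    autoResumeSuspendedTools: true, // no buttons, so suspended tools auto-resume
    extraProcessors: ["ConfirmationGate", "SendOnceGuard"],
    compactionThresholds: [0.55, 0.7],
    semanticRecall: { topK: 2, messageRange: 1 },
    observationalMemoryTokens: [15000, 30000],
  },
};

// A factory such as createConsulAgent(channel) could start from this lookup.
function settingsFor(channel: Channel): ChannelSettings {
  return CHANNEL_SETTINGS[channel];
}
```

Keeping the differences in data rather than in branches means a new channel would need a new entry, not a new agent definition.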

Memory is a three-tier system:

  • Working memory (structured user preferences, resource-scoped)
  • Semantic recall (cross-thread RAG via LibSQL vector)
  • Observational memory (auto-summaries via gpt-4.1-nano, resource-scoped)

2. Tool System

+5,282 / -2,866 lines across 34 files in tools/

New Infrastructure

| Component | File(s) | Purpose |
| --- | --- | --- |
| Tool Registry | tools/registry.ts (261 lines) | Single source of truth: flat map of all ~96 tools |
| Tool Index | tools/index.ts (101 lines) | Public API: exports registry, groups, policy, legacy compat |
| Tool Groups | tools/groups.ts (268 lines) | Semantic grouping: "<service>:<level>" (read/write/confirm). Supports wildcards ("gmail:*") and exclusions ("gmail:!confirm") |
| Tool Classification | tools/tool-classification.ts (269 lines) | Every tool classified as read, write, or confirm. Also defines MUTATION_FINGERPRINT_FIELDS and READ_IDENTITY_FIELDS |
| Tool Metadata | tools/tool-metadata.ts (1,763 lines) | Per-tool: schemaDescription, promptSummary, triggers, notToBeConfusedWith, parameterGuidance. Includes buildDisambiguationMatrix() for system prompt injection |
| Policy Pipeline | tools/policy/ (5 files, 402 lines) | 4-layer cascade: Channel → ConnectedIntegrations → Safety → UserOverrides. Resolves allowed tools per request |
| Mutation Fingerprinting | tools/mutation/ (4 files, 251 lines) | Per-field fingerprinting with SHA-256, MutationTracker with 5-min TTL |
| Confirmation Helper | tools/lib/with-confirmation.ts (64 lines) | requireConfirmation(): uses Mastra's native suspend/resume. No-op on iMessage (agent prompt handles it) |
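
The group selector syntax in the Tool Groups row ("<service>:<level>", "gmail:*", "gmail:!confirm") could be expanded roughly as follows. This is a sketch under assumptions: `ToolEntry`, `expandSelector()`, and the sample entries (including their classifications) are hypothetical and are not taken from tools/groups.ts.

```typescript
type Level = "read" | "write" | "confirm";

interface ToolEntry {
  name: string;
  service: string;
  level: Level;
}

// Expands "<service>:<level>", "<service>:*", or "<service>:!<level>"
// into the names of the matching tools.
function expandSelector(selector: string, tools: readonly ToolEntry[]): string[] {
  const sep = selector.indexOf(":");
  if (sep < 0) throw new Error(`Invalid selector: ${selector}`);
  const service = selector.slice(0, sep);
  const levelPart = selector.slice(sep + 1);

  const levelMatches = (level: Level): boolean => {
    if (levelPart === "*") return true; // wildcard: every level
    if (levelPart.startsWith("!")) return level !== levelPart.slice(1); // exclusion
    return level === levelPart; // exact level
  };

  return tools
    .filter((t) => t.service === service && levelMatches(t.level))
    .map((t) => t.name);
}

// Hypothetical sample registry entries for illustration only.
const sampleTools: readonly ToolEntry[] = [
  { name: "smartInbox", service: "gmail", level: "read" },
  { name: "createDraft", service: "gmail", level: "write" },
  { name: "sendEmail", service: "gmail", level: "confirm" },
  { name: "createEvent", service: "calendar", level: "confirm" },
];

const gmailAll = expandSelector("gmail:*", sampleTools);
const gmailSafe = expandSelector("gmail:!confirm", sampleTools);
```

With the sample data, `gmailAll` holds the three Gmail tools and `gmailSafe` drops `sendEmail`.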

Deleted Tools (-2,538 lines)

| Tool | Lines | Replaced By |
| --- | --- | --- |
| complex-task-tool.ts | 215 | Single agent's direct tool chaining |
| compose-email-tool.ts | 567 | draft-email-tool.ts (271 lines) |
| resume-workflow-tool.ts | 551 | Mastra native suspend/resume |
| scheduling/schedule-meeting-tool.ts | 1,116 | find-available-slots-tool.ts (388 lines) |
| start-schedule-meeting-tool.ts | 89 | Direct Inngest workflow trigger |

New Tools

| Tool | Lines | Purpose |
| --- | --- | --- |
| draft-email-tool.ts | 271 | Resolves recipient, generates AI draft, returns preview (classified as "read" because it has no side effects) |
| find-available-slots-tool.ts | 388 | Fetches scheduling prefs, resolves attendees, queries FreeBusy, finds available slots |
| bulkDismissRemindersTool | ~40 | Dismiss multiple reminders at once (in reminder-tools.ts) |

Modified Tools

Confirmation gates added to all confirm-classified tools using requireConfirmation():

  • Gmail: sendEmail, sendDraft, trashEmail, batchModifyEmails
  • Calendar: createEvent, updateEvent, quickAddEvent, addAttendees, removeAttendees, deleteEvent, cancelEvent
  • Drive: shareFile, updatePermission, removePermission, trashFile
  • Docs: deleteDocument
  • Slack: sendSlackMessage
  • Contacts: deleteContact

Gmail tools (+902 lines): New batchFetchFullMessages(), gzip compression, new fetchSentRepliesTool, awaitingReplyTool, smartInboxTool.

Resolve recipient tool: Now uses tiered search (curated relationships → AI recommendations → broader sources) with early return on high-confidence match.
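
A tiered search with early return might look like the sketch below. The tier names follow the description above, but `Candidate`, `SearchTier`, `resolveRecipient()`, and the 0.9 confidence threshold are hypothetical; the source does not state the actual threshold or data shapes, and a real implementation would likely be asynchronous.

```typescript
interface Candidate {
  email: string;
  confidence: number; // 0..1, higher is better
}

interface SearchTier {
  name: string;
  search: (query: string) => Candidate[];
}

interface Resolution {
  match?: Candidate;
  tiersSearched: string[];
}

// Walks the tiers in order and stops as soon as one yields a
// high-confidence candidate; otherwise returns the best seen overall.
function resolveRecipient(
  query: string,
  tiers: readonly SearchTier[],
  threshold = 0.9, // hypothetical cut-off
): Resolution {
  const tiersSearched: string[] = [];
  let best: Candidate | undefined;

  for (const tier of tiers) {
    tiersSearched.push(tier.name);
    for (const candidate of tier.search(query)) {
      if (best === undefined || candidate.confidence > best.confidence) best = candidate;
    }
    if (best !== undefined && best.confidence >= threshold) {
      return { match: best, tiersSearched }; // early return, skip slower tiers
    }
  }
  return { match: best, tiersSearched };
}

// Hypothetical tiers with canned results for illustration.
const demoTiers: readonly SearchTier[] = [
  { name: "curated-relationships", search: (q) => (q === "sam" ? [{ email: "sam@example.com", confidence: 0.95 }] : []) },
  { name: "ai-recommendations", search: (q) => (q === "alex" ? [{ email: "alex@example.com", confidence: 0.6 }] : []) },
  { name: "broader-sources", search: () => [] },
];
```

Ordering the tiers from cheapest and most trusted to broadest means the common case resolves after a single lookup.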

All tool descriptions rewritten to concise, parameter-focused format matching the metadata registry.


3. Processor Pipeline

12 new processors form a layered safety and intelligence system.

Input Processors (per-step)

| # | Processor | Hook | Purpose |
| --- | --- | --- | --- |
| 1 | DateTimeInjector | processInput | Injects date/time into user message (not system prompt) for prompt cache stability |
| 2 | MessageDeduplicator | processInput | Prevents OpenAI Responses API duplicate item_reference errors |
| 3 | ToolPolicyProcessor | processInputStep | Resolves allowed tools via 4-layer policy cascade. Caches after step 0 |
| 4 | ConfirmationGateProcessor (iMessage only) | processInputStep | Blocks CONFIRM tools unless prior turn communicated with user |
| 5 | MutationGuardProcessor | processInputStep | Prevents duplicate mutations via fingerprinting. warn for writes, block for confirms |
| 6 | RepeatCallDetector | processInputStep | Prevents redundant read calls via identity-based dedup |
| 7 | SendOnceGuard (iMessage only) | processInputStep | After sendResponse, disables all tools and forces empty return |
| 8 | ErrorRecoveryProcessor | processInputStep | Truncates oversized tool results (4k/8k chars) + classifies errors with recovery hints |
| 9 | StagedCompactionProcessor | processInputStep | Summarize-then-prune via gpt-4.1-nano. Channel-aware thresholds |
| 10 | ContextWindowGuard | processInputStep | Last-resort: warns at ~32k remaining, strips tools at ~16k remaining |
| 11 | EnsureFinalResponseProcessor | processInputStep | On final step, removes tools and forces response |
| 12 | TokenLimiter | processInputStep | Hard truncation safety net. Always LAST |
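
The 4-layer cascade that ToolPolicyProcessor resolves (Channel → ConnectedIntegrations → Safety → UserOverrides) can be read as a chain of filters, each narrowing the set left by the previous one. The following is a sketch under assumptions: the layer order comes from the source, but the rule inside each layer, the `PolicyContext` fields, and the sample tools are hypothetical.

```typescript
type Level = "read" | "write" | "confirm";
type Channel = "web" | "imessage";

interface ToolInfo {
  name: string;
  level: Level;
  service?: string; // undefined means built-in, no integration required
  channels?: readonly Channel[]; // undefined means available on every channel
}

interface PolicyContext {
  channel: Channel;
  connectedServices: readonly string[];
  readOnlyMode: boolean; // hypothetical safety flag
  userDisabledTools: readonly string[]; // hypothetical per-user override
}

type PolicyLayer = (tools: readonly ToolInfo[], ctx: PolicyContext) => ToolInfo[];

const channelLayer: PolicyLayer = (tools, ctx) =>
  tools.filter((t) => t.channels === undefined || t.channels.includes(ctx.channel));

const integrationsLayer: PolicyLayer = (tools, ctx) =>
  tools.filter((t) => t.service === undefined || ctx.connectedServices.includes(t.service));

const safetyLayer: PolicyLayer = (tools, ctx) =>
  tools.filter((t) => !ctx.readOnlyMode || t.level === "read");

const userOverridesLayer: PolicyLayer = (tools, ctx) =>
  tools.filter((t) => !ctx.userDisabledTools.includes(t.name));

// Order matters: each layer only sees what the previous one allowed.
const LAYERS: readonly PolicyLayer[] = [channelLayer, integrationsLayer, safetyLayer, userOverridesLayer];

function resolveAllowedTools(tools: readonly ToolInfo[], ctx: PolicyContext): string[] {
  return LAYERS.reduce<ToolInfo[]>((acc, layer) => layer(acc, ctx), [...tools]).map((t) => t.name);
}

// Hypothetical sample registry for illustration.
const policyDemoTools: readonly ToolInfo[] = [
  { name: "sendResponse", level: "write", channels: ["imessage"] },
  { name: "smartInbox", level: "read", service: "gmail" },
  { name: "sendEmail", level: "confirm", service: "gmail" },
  { name: "createEvent", level: "confirm", service: "calendar" },
];
```

Because the result depends only on the request context, computing it once and reusing it for later steps (as the "Caches after step 0" note describes) is safe as long as that context does not change mid-request.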

Output Processors (post-generation)

| Processor | Purpose |
| --- | --- |
| ToolResultTrimmer | Head+tail truncation (1.5k/3k chars) before memory saves. Strips verbose fields |

Two-Tier Truncation

| Layer | When | Limits | Purpose |
| --- | --- | --- | --- |
| ErrorRecoveryProcessor | Per-step (LLM view) | 4,000 / 8,000 chars | Rich data during current generation |
| ToolResultTrimmer | Post-generation (memory) | 1,500 / 3,000 chars | Lean storage across turns |
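
Head+tail truncation keeps the start and end of a long result and drops the middle. The sketch below is one possible implementation, not the code in either processor: `truncateHeadTail()`, the marker text, and the 70/30 head-to-tail split are assumptions. The source gives two limits per tier without saying when the higher one applies, so only the lower limit of each pair is used here, and applying the same function to both tiers is itself an assumption.

```typescript
const MARKER = "\n[... truncated ...]\n";

// Keeps the first `headRatio` share and the remaining tail share of the
// character budget, so the result is never longer than `limit`.
function truncateHeadTail(text: string, limit: number, headRatio = 0.7): string {
  if (text.length <= limit) return text;
  const budget = limit - MARKER.length;
  if (budget <= 0) return text.slice(0, limit); // limit too small for a marker
  const head = Math.ceil(budget * headRatio);
  const tail = budget - head;
  return text.slice(0, head) + MARKER + text.slice(text.length - tail);
}

// Lower limit of each tier from the table above.
const LLM_VIEW_LIMIT = 4000; // per-step, what the model sees
const MEMORY_LIMIT = 1500; // post-generation, what is stored

const rawResult = "H".repeat(3000) + "T".repeat(3000);
const forModel = truncateHeadTail(rawResult, LLM_VIEW_LIMIT);
const forMemory = truncateHeadTail(rawResult, MEMORY_LIMIT);
```

The two tiers trade off differently: the model sees more detail while it is still reasoning over the result, and memory keeps only enough to recall what happened.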

Mutation Safety Layering

Three distinct processors at different levels:

  1. ToolPolicyProcessor — which tools are available at all
  2. ConfirmationGateProcessor — which tools need prior user communication (iMessage)
  3. MutationGuardProcessor — which mutations have already been executed
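
The third layer, duplicate-mutation detection, could work along these lines: hash the identifying fields of a call with SHA-256 and remember the hash for five minutes. This is a sketch under assumptions. The SHA-256 hash, the 5-minute TTL, and the warn/block split come from the source; the `fingerprint()` and `decide()` helpers, the `MutationTracker` method names, and the example field list are hypothetical.

```typescript
import { createHash } from "node:crypto";

const TTL_MS = 5 * 60 * 1000; // 5-minute window from the source

// Hashes only the fields that identify the mutation, in a stable order.
function fingerprint(toolName: string, args: Record<string, unknown>, fields: readonly string[]): string {
  const picked = [...fields].sort().map((field) => [field, args[field]]);
  return createHash("sha256").update(JSON.stringify([toolName, picked])).digest("hex");
}

class MutationTracker {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = TTL_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true if the fingerprint was recorded within the TTL;
  // otherwise records it and returns false.
  checkAndRecord(fp: string): boolean {
    const t = this.now();
    const previous = this.seen.get(fp);
    if (previous !== undefined && t - previous < this.ttlMs) return true;
    this.seen.set(fp, t);
    return false;
  }
}

// Duplicate writes produce a warning; duplicate confirm-level calls are blocked.
function decide(level: "write" | "confirm", duplicate: boolean): "allow" | "warn" | "block" {
  if (!duplicate) return "allow";
  return level === "confirm" ? "block" : "warn";
}
```

Fingerprinting per field rather than over the whole argument object means two calls that differ only in incidental fields (for example, a reworded body) still count as the same mutation.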

4. Prompt Architecture

19 composable sections assembled by a builder pattern (prompts/builder.ts).

Priority Bands

| Range | Category | Sections |
| --- | --- | --- |
| 0-99 | Identity & context | identity (0), context (10) |
| 100-199 | Tool sections | tool-listing (100), tool-call-style (110), task-planning (115), tool-capability-hints (120), contextual-references (130) |
| 200-299 | Capability instructions | email-composition (200), meeting-scheduling (210) |
| 300-399 | Behavioral rules | confirmation-behavior (300), cross-service (310), memory-recall (315), working-memory (320), web-formatting (330), imessage-formatting (330), response-delivery (335), reminders (340), greetings (350) |
| 400-499 | Error handling & safety | errors (400), critical-rules (410) |

Key Design Decisions

  • Cache-stable system prompt: Date/time is NOT in the system prompt — DateTimeInjector puts it in user messages. This enables OpenAI's prompt caching (50% discount on cached tokens).
  • Channel-aware rendering: Many sections render different content per channel (identity tone, confirmation flow, formatting rules).
  • Conditional sections: email-composition only renders when Gmail connected, meeting-scheduling when Calendar connected, cross-service when 2+ services connected.
  • Voice calibration: identity section applies toneStyle (formal/brief/casual/balanced) from agent preferences with channel-specific adjustments.
  • Disambiguation matrix: tool-capability-hints dynamically builds a confusion matrix from tool-metadata.ts for connected services only.
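
A priority-ordered builder with conditional, channel-aware sections might be structured as in the sketch below. It is not the code in prompts/builder.ts: the `PromptSection` shape, `buildPrompt()`, and the rendered strings are hypothetical, while the section names, priorities, and conditions follow the tables and bullets above.

```typescript
type Channel = "web" | "imessage";

interface PromptContext {
  channel: Channel;
  connectedServices: readonly string[];
}

interface PromptSection {
  id: string;
  priority: number; // lower renders earlier
  when?: (ctx: PromptContext) => boolean; // omitted means always rendered
  render: (ctx: PromptContext) => string;
}

function buildPrompt(sections: readonly PromptSection[], ctx: PromptContext): string {
  return sections
    .filter((s) => s.when?.(ctx) ?? true)
    .sort((a, b) => a.priority - b.priority)
    .map((s) => s.render(ctx))
    .join("\n\n");
}

// A few sections with placeholder text; the real content is not in the source.
const demoSections: readonly PromptSection[] = [
  { id: "critical-rules", priority: 410, render: () => "[critical-rules]" },
  { id: "identity", priority: 0, render: (ctx) => `[identity:${ctx.channel}]` },
  { id: "email-composition", priority: 200, when: (ctx) => ctx.connectedServices.includes("gmail"), render: () => "[email-composition]" },
  { id: "cross-service", priority: 310, when: (ctx) => ctx.connectedServices.length >= 2, render: () => "[cross-service]" },
  { id: "web-formatting", priority: 330, when: (ctx) => ctx.channel === "web", render: () => "[web-formatting]" },
  { id: "imessage-formatting", priority: 330, when: (ctx) => ctx.channel === "imessage", render: () => "[imessage-formatting]" },
];

const webPrompt = buildPrompt(demoSections, { channel: "web", connectedServices: ["gmail", "calendar"] });
const imessagePrompt = buildPrompt(demoSections, { channel: "imessage", connectedServices: ["calendar"] });
```

Note that nothing in the builder reads the clock, which is consistent with the cache-stability goal: the same context always yields the same system prompt.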

5. Middleware

Middleware is decomposed into 5 focused functions in middleware/index.ts:

| Middleware | Scope | Purpose |
| --- | --- | --- |
| authMiddleware | Global | JWT verification, gateway secret auth, sets userId |
| bodyParsingMiddleware | POST requests | Extracts whitelisted context (22 keys) from request body |
| contextPopulationMiddleware | API + custom routes | Fetches profile, preferences, connected integrations from Supabase (5-min cache) |
| dateTimeMiddleware | Global | Formats current date/time in user's timezone |
| sessionLoggingMiddleware | POST /chat | Records chat session activity (non-blocking) |
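
Whitelist extraction, as done by bodyParsingMiddleware, copies only known keys from an untrusted body. The sketch below shows the idea with three hypothetical keys; the source says the real whitelist has 22 keys but does not list them, and `extractContext()` is an illustrative name.

```typescript
// Hypothetical subset; the real whitelist has 22 keys that the source does not list.
const CONTEXT_WHITELIST = ["timezone", "threadId", "channel"] as const;
type ContextKey = (typeof CONTEXT_WHITELIST)[number];

// Copies only whitelisted keys; anything else in the body is ignored.
function extractContext(body: unknown): Partial<Record<ContextKey, unknown>> {
  const out: Partial<Record<ContextKey, unknown>> = {};
  if (typeof body !== "object" || body === null) return out;
  const record = body as Record<string, unknown>;
  for (const key of CONTEXT_WHITELIST) {
    if (Object.prototype.hasOwnProperty.call(record, key)) out[key] = record[key];
  }
  return out;
}

const extracted = extractContext({ timezone: "UTC", userId: "spoofed", isAdmin: true });
```

Iterating over the whitelist rather than over the body keeps caller-supplied fields such as `userId` out of the request context, which stays the job of authMiddleware.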

6. Messaging Gateway

The largest single change: the monolithic router.ts (2,904 lines) is decomposed into a pipeline architecture driven by a 197-line thin orchestrator.

New Architecture

A. Channel Plugin System (channels/)

  • ChannelPlugin interface with capability metadata (supportsTypingIndicator, supportsReactions, supportsEffects, supportsMarkdown)
  • ChannelRegistry for lifecycle management
  • Plugins: IMessagePlugin, SMSPlugin, AgentMailPlugin
  • Replaces hardcoded if (channel === "imessage") with capability checks
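
Swapping channel-name checks for capability checks could look like the sketch below. The four capability flags and the plugin and registry names come from the list above; the interface shapes, the capability values assigned to each plugin, and the deliberately naive `stripMarkdown()` are assumptions.

```typescript
interface ChannelCapabilities {
  supportsTypingIndicator: boolean;
  supportsReactions: boolean;
  supportsEffects: boolean;
  supportsMarkdown: boolean;
}

interface ChannelPlugin {
  id: string;
  capabilities: ChannelCapabilities;
}

class ChannelRegistry {
  private readonly plugins = new Map<string, ChannelPlugin>();

  register(plugin: ChannelPlugin): void {
    this.plugins.set(plugin.id, plugin);
  }

  get(id: string): ChannelPlugin {
    const plugin = this.plugins.get(id);
    if (plugin === undefined) throw new Error(`Unknown channel: ${id}`);
    return plugin;
  }
}

// Naive stripping for illustration; real markdown handling needs more care.
function stripMarkdown(text: string): string {
  return text.replace(/[*_`#]/g, "").trim();
}

// Asks what the channel can do instead of which channel it is.
function prepareReply(plugin: ChannelPlugin, text: string): string {
  return plugin.capabilities.supportsMarkdown ? text : stripMarkdown(text);
}

// Capability values below are assumptions, not taken from the source.
const registry = new ChannelRegistry();
registry.register({ id: "imessage", capabilities: { supportsTypingIndicator: true, supportsReactions: true, supportsEffects: true, supportsMarkdown: false } });
registry.register({ id: "sms", capabilities: { supportsTypingIndicator: false, supportsReactions: false, supportsEffects: false, supportsMarkdown: false } });
registry.register({ id: "agentmail", capabilities: { supportsTypingIndicator: false, supportsReactions: false, supportsEffects: false, supportsMarkdown: true } });
```

Adding a channel then means registering a plugin with its flags; call sites such as `prepareReply()` do not change.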

B. Pipeline Stages (pipeline/)

IncomingMessage
  → CapabilityResolver (resolve userId/orgId)
  → EmailDeduplicator (skip duplicates)
  → ProfileEnricher (timezone, name, email, EA identity)
  → ContextBuilder (datetime, identities, scheduling, integrations)
  → RouteResolver (priority-based bindings)
  → AgentCaller (HTTP call with retry + presence)
  → ResponseProcessor (messaging-handled detection, markdown strip)
  → SessionRecorder (token attribution)
  → ProspectPostProcessor (lead scoring)
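
A thin orchestrator over such stages can be little more than a loop. The sketch below assumes each stage either returns the (possibly enriched) message or returns null to stop the run, which is how a deduplicator could skip a message. The `Stage` and `PipelineMessage` shapes and `runPipeline()` are hypothetical, only two of the nine stages are stubbed, and real stages would be asynchronous.

```typescript
interface PipelineMessage {
  id: string;
  channel: string;
  text: string;
  context: Record<string, unknown>;
}

// A stage returns the message to pass on, or null to stop the pipeline.
interface Stage {
  name: string;
  run: (msg: PipelineMessage) => PipelineMessage | null;
}

interface PipelineResult {
  message: PipelineMessage | null;
  stoppedAt?: string;
}

function runPipeline(stages: readonly Stage[], incoming: PipelineMessage): PipelineResult {
  let current = incoming;
  for (const stage of stages) {
    const next = stage.run(current);
    if (next === null) return { message: null, stoppedAt: stage.name };
    current = next;
  }
  return { message: current };
}

// Stubbed stages for illustration.
const seenIds = new Set<string>();
const emailDeduplicator: Stage = {
  name: "EmailDeduplicator",
  run: (msg) => {
    if (seenIds.has(msg.id)) return null; // duplicate: skip the rest
    seenIds.add(msg.id);
    return msg;
  },
};
const profileEnricher: Stage = {
  name: "ProfileEnricher",
  run: (msg) => ({ ...msg, context: { ...msg.context, timezone: "UTC" } }), // placeholder value
};

const demoStages: readonly Stage[] = [emailDeduplicator, profileEnricher];
const demoMessage: PipelineMessage = { id: "m1", channel: "email", text: "hello", context: {} };
const firstRun = runPipeline(demoStages, demoMessage);
const secondRun = runPipeline(demoStages, demoMessage);
```

With the stubs above, the first run reaches the enricher and the second run stops at the deduplicator.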

C. Route Binding System (pipeline/bindings/)

| Priority | Binding | Action |
| --- | --- | --- |
| 100 | SuspendedSchedulingWorkflow | Resume suspended workflow (email) |
| 90 | InngestApproval | Route to Inngest for meeting confirmation |
| 80 | CCScheduling | Start new scheduling workflow |
| 50 | Prospect | Route to sales-agent |
| 30 | EmailChannel | Route to email-orchestrator-agent |
| 30 | IMessageChannel | Route to imessage-consul-agent |
| 0 | Fallback | Route to consul-agent |
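
Priority-based resolution over these bindings amounts to "highest priority whose predicate matches wins". The sketch below uses five of the seven bindings; the priorities and agent targets come from the table, while the `RouteContext` fields, the match predicates, the `resume-scheduling-workflow` target label, and routing SMS to the iMessage agent are assumptions.

```typescript
interface RouteContext {
  channel: "web" | "imessage" | "sms" | "email";
  isProspect: boolean;
  hasSuspendedSchedulingWorkflow: boolean;
}

interface Binding {
  name: string;
  priority: number;
  target: string;
  matches: (ctx: RouteContext) => boolean;
}

// Predicates are hypothetical; the source lists only names, priorities, and actions.
const BINDINGS: readonly Binding[] = [
  { name: "Fallback", priority: 0, target: "consul-agent", matches: () => true },
  { name: "EmailChannel", priority: 30, target: "email-orchestrator-agent", matches: (c) => c.channel === "email" },
  { name: "IMessageChannel", priority: 30, target: "imessage-consul-agent", matches: (c) => c.channel === "imessage" || c.channel === "sms" },
  { name: "Prospect", priority: 50, target: "sales-agent", matches: (c) => c.isProspect },
  { name: "SuspendedSchedulingWorkflow", priority: 100, target: "resume-scheduling-workflow", matches: (c) => c.channel === "email" && c.hasSuspendedSchedulingWorkflow },
];

// Highest-priority matching binding wins; Fallback guarantees a result.
function resolveRoute(bindings: readonly Binding[], ctx: RouteContext): Binding {
  const match = [...bindings].sort((a, b) => b.priority - a.priority).find((b) => b.matches(ctx));
  if (match === undefined) throw new Error("No binding matched and no fallback is registered");
  return match;
}
```

Because bindings are sorted at resolution time, registration order does not matter, and a new special case only needs a priority that places it above the channel defaults.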

D. Session Queue (pipeline/session-queue.ts)

  • Concurrency management for rapid-fire messages per session
  • 4 modes: queue (FIFO), collect (batch with timeout), interrupt (cancel + restart), followup (queue + combine)

E. ReplyDispatcher (lib/reply-dispatcher.ts)

  • Unified message delivery: dedup (5s window), presence management, tapback delivery, paced multi-message, markdown stripping, iMessage effects

F. Extracted Libraries (lib/)

  • agentmail-history.ts — AgentMail conversation fetcher
  • encryption.ts — AES-256-GCM decryption
  • intent-detection.ts — Reschedule intent detection
  • markdown.ts — Markdown stripping
  • scheduling-api.ts — Scheduling workflow HTTP helpers
  • tool-response.ts — Tool response parsing helpers

7. Deleted Workflows & Services

Deleted Workflows (-5,532 lines)

| Workflow | Lines | Replacement |
| --- | --- | --- |
| hitl/calendar-action-workflow.ts | 1,778 | Tool-level requireConfirmation() + autoResumeSuspendedTools |
| hitl/email-action-workflow.ts | 1,218 | Same |
| hitl/drive-action-workflow.ts | 689 | Same |
| hitl/slack-action-workflow.ts | 449 | Same |
| compose-email-workflow.ts | 848 | Single agent chains draftEmail → sendEmail directly |
| complex-task-workflow.ts | 501 | Single agent with maxSteps: 10 and direct tool access |
| schedule-meeting-workflow.ts | 49 | Direct Inngest trigger |

Deleted Services (-983 lines)

| Service | Lines | Replacement |
| --- | --- | --- |
| approval-service.ts | 557 | Mastra native tool suspend/resume |
| artifact-approval-handler.ts | 426 | Tool-level suspend/resume |

Deleted Utilities (-523+ lines)

| Utility | Lines | Reason |
| --- | --- | --- |
| utils/step-executor.ts | 313 | Only used by deleted complex-task-workflow.ts |
| utils/variable-resolver.ts | ~100+ | Only used by step-executor.ts |
| types/planning.ts | 318 | Type definitions for deleted plan/execute pattern |

Retained Workflows (unchanged)

  • emailTriageWorkflow, dailyBriefWorkflow, salesProcessingWorkflow, tagNotificationWorkflow, imessageSendWorkflow
  • Inngest workflows (email scheduling, schedule meeting, relationships)

8. Web App Changes

Chat Interface (chat-interface.tsx)

  • Added data-tool-call-suspended rendering: Displays confirmation previews from suspended tools (HITL). Shows suspendPayload.preview or suspendPayload.message as assistant message.
  • Simplified tool activity display: Removed streamingToolName state, onData callback parsing, and associated useEffect. Now relies solely on useToolActivity hook.

Tool Display Names

  • Added: draft-email, find-available-slots
  • Renamed: get-freebusy → get-free-busy, cancel-reminder → dismiss-reminder
  • Removed: execute-complex-task, executeComplexTask

Relationships Components

  • Refactored 4 dialog components to use React key prop pattern instead of useEffect for form state sync.

9. New Documentation

| File | Lines | Purpose |
| --- | --- | --- |
| docs/AGENT_REFACTOR_PLAN.md | 970 | Comprehensive refactor plan with architecture analysis, problem statement, and phased implementation |
| docs/PRODUCT_OVERVIEW.md | 234 | Product documentation |
| docs/TOOL_SUSPENSION_COMPLETE.md | 218 | Completed tool-level HITL implementation docs |
| docs/TOOL_SUSPENSION_IMPLEMENTATION.md | 95 | Technical implementation guide for tool suspension |

10. Tone & Voice Refinements

Across all remaining agents, exclamation marks and corporate pleasantries have been replaced with professional, direct language:

| Agent | Before | After |
| --- | --- | --- |
| Onboarding | "Keep the experience magical" | "Demonstrate practical utility" |
| Onboarding | "Show the user how valuable you can be" | "Let the results speak for themselves" |
| Scheduling | "Happy to help find a time, here are some available:" | "Here are some available times:" |
| Email Composer | "Got it! Here are some updated times" | "Understood. Here are some updated times" |
| Email Triage | "Thanks for reaching out!" | "Thank you for reaching out." |
| iMessage Greetings | "What do you need?" | "How can I be of assistance?" |
| iMessage Formatting | (none) | Multi-bubble calendar events with meet links |

Commit History

| Hash | Message |
| --- | --- |
| c07dcea | refactor(agents): single-agent architecture with intelligence improvements |
| b496eb6 | refactor(agents): harden processor pipeline, fix tool classification, and improve tool disambiguation |
| 5389033 | refactor(agents): improve iMessage greeting tone and add multi-bubble calendar formatting |