Consul Agent Architecture Refactor Plan

Executive Summary

This document outlines the refactoring of Consul's agent system from a rigid multi-agent orchestrator pattern (28 agents, 114 tools, 13 workflows, 5 layers of indirection) to a single intelligent agent with direct tool access, dynamic tool discovery, and tool-level HITL. The goal: an AI assistant that reasons and acts instead of routing and following scripts.

The Problem

The current architecture turns a powerful LLM into a dumb router. When a user says "archive the newsletters," the AI:

Identifies newsletters through LLM intelligence (reading sender names)
Loses that intelligence when routing to emailActionWorkflow
The workflow searches label:Newsletter (doesn't exist) and fails
The user gets "I couldn't find any emails matching the newsletter category"

This is a symptom of a systemic issue: over-structured agent design that strips the AI of its reasoning ability.

The Solution

One agent. Direct tool access. Let the AI think.

Dimension	Current	Target
Agents	28 (17 active)	1 main + 2-3 specialists
Routing	170 lines of instruction rules	LLM reasoning + tool descriptions
Tools	Split across sub-agents	All tools on main agent (via ToolSearchProcessor)
HITL	450-line workflow state machines	Tool-level `suspend()` + `autoResumeSuspendedTools`
Layers	5 (request → orchestrator → sub-agent → workflow → step → API)	2 (request → agent → tool → API)

Why This Will Be Better

Anthropic: "Find the simplest solution possible, and only increase complexity when needed." (Building Effective Agents)

OpenAI: "A single agent can handle many tasks by incrementally adding tools." (Building Agents Guide)

Microsoft: "Don't assume role separation requires multiple agents. Often, a single agent using persona switching and context-aware policies can satisfy role-based behavior." (Cloud Adoption Framework)

LangChain: "Single agents are simpler to build, reason about, and debug." (Multi-Agent Architecture Guide)

Multi-agent is justified only for: security/compliance boundaries, separate teams, or proven single-agent failures. Consul has none of these.

Current Architecture Analysis

Layer Map

User Message
  ↓ (1) Chat Endpoint (chat-with-logging.ts)
    ↓ (2) Web Orchestrator (170 lines of routing rules)
      ↓ (3) Sub-Agent (gmailQueryAgent, gmailActionAgent, etc.)
        ↓ (4) HITL Workflow (emailActionWorkflow — 1256 LOC state machine)
          ↓ (5) Tool → Gmail API

5 layers. 5 handoff points. 5 places where intelligence gets lost.

Registered Components

Agents (24):

3 orchestrators (web, iMessage, email)
10 domain agents (Gmail query/action, Calendar query/action, Drive query/action, Slack query/action, Contacts, Docs)
5 utility agents (scheduling, triage, sales, planning, analysis)
6 support agents (validator, reminder, onboarding, iMessage, etc.)

Workflows (13):

4 HITL approval workflows (email action, calendar action, drive action, slack action)
2 conversational workflows (compose email, schedule meeting)
7 background workflows (daily brief, email triage, sales processing, etc.)

Tools (80+):

Gmail: 19 tools
Calendar: 14 tools
Drive: 20 tools
Slack: 13 tools
Contacts: 9 tools
Docs: 6+ tools
Orchestration: 10+ tools (compose, schedule, complex task, resume, etc.)
Misc: reminders, feedback, etc.

Where Intelligence Gets Lost

Orchestrator routing (web-orchestrator-agent.ts:176-269): 170 lines of hardcoded if-then rules that turn reasoning into pattern matching.
Query/Action agent split: Forces the orchestrator to decide read vs. write before understanding the request. "Find newsletters then archive them" requires two different agents.
HITL workflows strip AI agency: email-action-workflow.ts is 1256 lines of manual state machine. When the search returns 0 results, the workflow throws an error — it can't reason about alternatives.
smartInboxTool data gap (gmail-tools.ts:2438-2566): Marketing emails get count-only (maxResults: 1), details skipped (if (category === "Marketing") continue), newsletters hardcoded to 0.
Label-based search disconnect: The AI identifies newsletters by sender inference, but the workflow searches by Gmail label (which doesn't exist).

What Works Well (Keep)

Tool implementations: Well-isolated, clear input/output schemas, good error handling
Three-tier token resolution (token-resolver.ts): Input → RequestContext → Supabase with auto-refresh
Middleware pipeline: Auth → body parsing → context population → datetime — clean and necessary
MessageDeduplicator processor: Prevents OpenAI Responses API duplicate errors
Connected integrations cache: 5-min TTL prevents 6 Supabase queries per request
Memory configurations: Web (lastMessages=0 for useChat) vs iMessage (lastMessages=15) is intentional
Background workflows: Daily brief, email triage, sales processing — these are batch jobs, not interactive
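
The connected integrations cache in the list above follows a common pattern: remember a loaded value per user and reload it only after the TTL has passed. A minimal sketch of that pattern, assuming a hypothetical `TtlCache` class and an injectable clock for testing (neither is Consul's actual implementation):

```typescript
// Hypothetical sketch of a per-key TTL cache. `TtlCache` is an illustrative
// name, not the existing Consul code.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is fresh; otherwise calls `load`
  // and caches the result. V may itself be a Promise for async loaders.
  getOrLoad(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) return hit.value;
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

With a 5-minute TTL, `new TtlCache(5 * 60_000)` would call the loader at most once per user per five minutes, which is where the saved Supabase queries come from.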

Target Architecture

Single Agent with Dynamic Tools

User Message
  ↓ Chat Endpoint (chat-with-logging.ts — keep as-is)
    ↓ Consul Agent (single agent, all tools accessible via ToolSearchProcessor)
      ↓ Tool calls directly (search, archive, send, create event, etc.)
        ↓ Tool-level suspend() for confirmations
          ↓ User confirms → tool resumes → API call

2-3 layers instead of 5. The AI reasons about what tools to use.

Core Agent Definition

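import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { ToolSearchProcessor, TokenLimiter, ToolCallFilter } from "@mastra/core/processors";
import { MessageDeduplicator } from "../processors/message-deduplicator";

// Core tools (always available) and all domain tools (for ToolSearchProcessor discovery),
// both exported from the tool index file created in Phase 0.3
import { coreTools, allDiscoverableTools } from "../tools";

// The agent below also references imessageTools, userPreferencesSchema, and
// buildInstructions. Their module paths here are assumptions, not existing files:
import { imessageTools } from "../tools/imessage";
import { userPreferencesSchema } from "../schemas/user-preferences";
import { buildInstructions } from "./build-instructions";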

export const consulAgent = new Agent({
  id: "consul-agent",
  model: "openai/gpt-4.1-mini",

  instructions: async ({ requestContext }) => {
    // Dynamic, context-aware instructions (see detailed spec below)
    return buildInstructions(requestContext);
  },

  // Core tools always loaded (reminders, feedback — used every session)
  tools: async ({ requestContext }) => {
    return {
      ...coreTools,
      // Channel-specific tools
      ...(requestContext?.get("channel") === "imessage" ? imessageTools : {}),
    };
  },

  memory: new Memory({
    options: {
      lastMessages: false,        // useChat manages history client-side
      semanticRecall: false,
      workingMemory: {
        enabled: true,
        scope: "resource",        // Cross-session user preferences
        schema: userPreferencesSchema,  // Structured merge semantics
      },
    },
  }),

  inputProcessors: [
    new MessageDeduplicator(),     // CRITICAL: OpenAI Responses API dedup
    new ToolSearchProcessor({
      tools: allDiscoverableTools, // ~80 tools discoverable on demand
      search: { topK: 8, minScore: 0.1 },
    }),
    new ToolCallFilter({
      exclude: ["compose-email", "schedule-meeting"],
    }),
    new TokenLimiter(127000),      // Must be LAST
  ],

  defaultOptions: {
    autoResumeSuspendedTools: true,  // Conversational HITL
    maxSteps: 10,
    providerOptions: {
      openai: { parallel_tool_calls: false },
    },
  },
});

Tool Organization

Core Tools (Always Loaded)

These are high-frequency tools that don't need discovery:

- createReminder, listReminders, cancelReminder, editReminder
- submitFeedback
- resolveRecipient (contact lookup — needed for many flows)

Discoverable Tools (Via ToolSearchProcessor)

Organized by domain, loaded on demand:

Gmail (19 tools):
  smartInbox, getEmail, getThread, searchEmails, resolveEmail, listDrafts, listLabels
  sendEmail*, createDraft, updateDraft, sendDraft
  archiveEmails*, applyLabel, removeLabel, markAsRead, markAsUnread, star
  trashEmails*, deleteEmail, deleteDraft

Calendar (14 tools):
  getEvent, listEvents, listCalendars, searchEvents, getFreeBusy
  createEvent*, quickAddEvent, updateEvent*
  moveEvent, setEventColor, addAttendees, removeAttendees
  deleteEvent*, cancelEvent*

Drive (20 tools): [list, search, download, upload*, create*, share*, delete*, ...]
Slack (13 tools): [listChannels, getHistory, searchMessages, sendMessage*, ...]
Contacts (9 tools): [list, get, search, create*, update*, delete*, ...]
Docs (6+ tools): [get, search, create*, update*, ...]

* = tools with suspend() for HITL confirmation

The Agent Discovers and Loads Tools Dynamically

When user says "archive my newsletters":

Agent calls search_tools("archive email") → finds archiveEmails, searchEmails
Agent calls load_tool("archiveEmails") and load_tool("searchEmails")
Agent calls searchEmails({ query: "from:sam OR from:ai-collective OR from:zulie" })
Agent calls archiveEmails({ messageIds: [...] }) → tool suspends with preview
User confirms → tool resumes → emails archived

The AI reasons about how to find the newsletters (by sender, since it identified them earlier) instead of blindly searching label:Newsletter.
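
The sender-based query in the steps above can be assembled mechanically once the agent has a list of sender hints. A small sketch, assuming a hypothetical helper `buildSenderQuery` (not part of the existing tools) and relying only on Gmail's `from:` and `OR` search operators:

```typescript
// Hypothetical helper: turn sender hints the agent inferred earlier in the
// conversation into a Gmail search query string.
function buildSenderQuery(senders: string[]): string {
  const cleaned = senders.map((s) => s.trim()).filter((s) => s.length > 0);
  const unique = Array.from(new Set(cleaned)); // drop duplicate hints
  return unique.map((s) => `from:${s}`).join(" OR ");
}

// Example: the three senders from the walkthrough above.
const newsletterQuery = buildSenderQuery(["sam", "ai-collective", "zulie"]);
// newsletterQuery === "from:sam OR from:ai-collective OR from:zulie"
```

In practice the agent would write this query itself; the helper only illustrates that the query derives from senders rather than from a label.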

HITL: Tool-Level Suspend/Resume

Replace the HITL approval workflows (1256 lines for email actions alone) with tool-level suspend/resume.

Example: Archive Emails Tool

export const archiveEmailsTool = createTool({
  id: "archive-emails",
  description: "Archive one or more emails. Removes them from inbox. Suspends for user confirmation before executing.",
  inputSchema: z.object({
    messageIds: z.array(z.string()).describe("Gmail message IDs to archive"),
  }),
  outputSchema: z.object({
    archived: z.number(),
    messageIds: z.array(z.string()),
  }),
  suspendSchema: z.object({
    preview: z.string().describe("Formatted preview of emails to archive"),
    count: z.number(),
    emails: z.array(z.object({
      subject: z.string().optional(),
      from: z.string().optional(),
    })),
  }),
  resumeSchema: z.object({
    approved: z.boolean(),
  }),
  execute: async (input, context) => {
    const { resumeData, suspend } = context?.agent ?? {};
    const accessToken = await resolveGoogleToken("gmail", undefined, context?.requestContext);

    if (!accessToken) {
      throw new Error("Gmail not connected. Connect your Gmail account in Settings.");
    }

    // First call (no resume data yet): fetch previews and suspend for confirmation
    if (!resumeData) {
      const emails = await batchFetchMetadata(input.messageIds, accessToken);
      const preview = emails
        .map(e => `- **${e.subject}** from ${e.from}`)
        .join("\n");

      return suspend?.({
        preview,
        count: input.messageIds.length,
        emails: emails.map(e => ({ subject: e.subject, from: e.from })),
      });
    }

    // User declined: leave the inbox untouched instead of suspending again
    if (!resumeData.approved) {
      return { archived: 0, messageIds: [] };
    }

    // Approved — execute
    await batchModify(accessToken, input.messageIds, { removeLabelIds: ["INBOX"] });
    return { archived: input.messageIds.length, messageIds: input.messageIds };
  },
});

With autoResumeSuspendedTools: true, the conversation flows naturally:

User: "Archive all the newsletters"
Agent: [searches emails, finds 7] "I found 7 newsletters to archive:
  - **AI Weekly** from The AI Collective
  - **Product Update** from Zulie
  - ... (5 more)
  Archive these 7 emails?"
User: "Yes"
Agent: [auto-resumes tool] "Done! Archived 7 newsletters."

No workflow. No state machine. The AI handles the entire flow.
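
The confirmation gate behind this conversation has three outcomes: suspend on the first call, return a no-op result if the user declines, and execute if the user approves. A framework-free sketch of that control flow, where `archiveWithConfirmation`, `Outcome`, and the injected `archive` callback are hypothetical stand-ins rather than Mastra APIs:

```typescript
// Framework-free model of a confirm-before-write tool. This is not Mastra's
// API: in the real tool, suspending goes through the suspend() callback.
type ResumeData = { approved: boolean } | undefined;

type Outcome =
  | { status: "suspended"; preview: string; count: number }
  | { status: "declined"; archived: number }
  | { status: "done"; archived: number };

function archiveWithConfirmation(
  emails: { id: string; subject: string; from: string }[],
  resumeData: ResumeData,
  archive: (ids: string[]) => void,
): Outcome {
  // First call: nothing has been confirmed yet, so build a preview and stop.
  if (resumeData === undefined) {
    const preview = emails.map((e) => `- **${e.subject}** from ${e.from}`).join("\n");
    return { status: "suspended", preview, count: emails.length };
  }
  // Resumed with a "no": leave the inbox untouched.
  if (!resumeData.approved) {
    return { status: "declined", archived: 0 };
  }
  // Resumed with a "yes": perform the write.
  archive(emails.map((e) => e.id));
  return { status: "done", archived: emails.length };
}
```

Keeping the declined branch separate from the first-call branch matters: if both were folded into one "not approved" check, a "no" from the user would trigger another preview instead of ending the flow.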

Which Current Workflows Convert to Tool Suspend

Current Workflow	Lines	Replacement	Rationale
emailActionWorkflow	1256	Tool-level suspend on action tools (archive, trash, delete, label)	Simple approve/decline. AI handles search/preview.
calendarActionWorkflow	1779	Tool-level suspend on mutating calendar tools (create, update, delete)	Same pattern — preview then confirm.
driveActionWorkflow	~800	Tool-level suspend on drive mutation tools	Same pattern.
slackActionWorkflow	~600	Tool-level suspend on sendSlackMessage	Same pattern.
composeEmailWorkflow	849	Consolidated compose-email tool with multi-turn suspend	Keep as sophisticated tool with suspend/resume for draft preview + edit cycles.
scheduleMeetingWorkflow	Inngest	Keep as Inngest function	Complex async scheduling with external availability — genuinely needs workflow.

Which Workflows to Keep

Workflow	Keep?	Why
dailyBriefWorkflow	Yes	Batch cron job, no user interaction
emailTriageWorkflow	Yes	Background classification
salesProcessingWorkflow	Yes	Background email handling
imessageSendWorkflow	Yes	Gateway integration
tagNotificationWorkflow	Yes	Background notification
scheduleMeetingFunction (Inngest)	Yes	Complex async scheduling

These are background/batch workflows, not interactive HITL gates. They stay.

Instructions Design

The agent instructions should be focused on identity and behavior, not routing. Tool descriptions handle the "when to use this tool" question.

function buildInstructions(requestContext: RequestContext): string {
  const userName = requestContext?.get("userName");
  const userTimezone = requestContext?.get("userTimezone");
  const dateTime = requestContext?.get("currentDateTime");
  const connected = requestContext?.get("connectedIntegrations");
  const agentPrefs = requestContext?.get("agentPreferences");
  const assistantName = agentPrefs?.displayName || "Consul";

  return `You are ${assistantName}, a personal AI executive assistant${userName ? ` for ${userName}` : ""}.

## Identity
Confident, direct, helpful — not robotic. Brief for actions, thorough for questions.
${userName ? `Use "${userName.split(" ")[0]}" occasionally, not every message.` : ""}

## Context
${dateTime ? `Now: ${dateTime.dayOfWeek}, ${dateTime.date} at ${dateTime.time} (${dateTime.timezone})` : `Timezone: ${userTimezone}`}

## Connected Services
${connected?.gmail ? "Gmail: Connected" : "Gmail: Not connected"}
${connected?.calendar ? "Google Calendar: Connected" : "Google Calendar: Not connected"}
${connected?.drive ? "Google Drive: Connected" : "Google Drive: Not connected"}
${connected?.slack ? "Slack: Connected" : "Slack: Not connected"}
${connected?.contacts ? "Google Contacts: Connected" : "Google Contacts: Not connected"}
${connected?.docs ? "Google Docs: Connected" : "Google Docs: Not connected"}

## How to Work
- Use **search_tools** to find tools for the task, then **load_tool** to make them available.
- For read operations, call tools directly. For write/delete operations, tools will ask for confirmation.
- When searching for emails, be creative: search by sender, subject, date — not just labels.
- If a tool search returns no results, try different keywords.
- Chain tool calls to accomplish complex tasks (you have up to 10 steps).
- ALWAYS fetch fresh data — never reuse stale data from conversation history.

## Confirmation Behavior
- Emails, calendar events, file operations: tools automatically suspend for the user's confirmation.
- You'll see a preview. Present it clearly to the user and ask if they want to proceed.
- When the user confirms, the tool resumes automatically.

## Formatting
- Use markdown for clarity: **bold** for emphasis, bullet lists for items.
- Keep responses concise. "Done! Archived 7 newsletters." not "I have successfully completed the archival process for 7 newsletter emails."

## Reminders
- Pass times in LOCAL format (${userTimezone}) — NO 'Z' suffix.

## Errors
- Service not connected: "I don't have access to [service]. Connect it in **Settings**."
- Tool fails: friendly message + offer to retry.
`;
}

Key change: No routing rules. No "if gmail connected, route to gmailQueryAgent." The agent sees what's connected and uses tools accordingly. Tool descriptions tell the AI when each tool is appropriate.
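
The Connected Services section of the prompt is the only place the agent learns which integrations exist, so it helps to generate it from one table rather than six hand-written template lines. A sketch under that assumption; `SERVICE_LABELS` and `renderConnectedServices` are hypothetical names, and the keys mirror the `connectedIntegrations` fields used above:

```typescript
// Hypothetical sketch: render the "Connected Services" prompt section from a
// single label table. Keys mirror the connectedIntegrations fields above.
const SERVICE_LABELS: Record<string, string> = {
  gmail: "Gmail",
  calendar: "Google Calendar",
  drive: "Google Drive",
  slack: "Slack",
  contacts: "Google Contacts",
  docs: "Google Docs",
};

function renderConnectedServices(
  connected: Record<string, boolean | undefined> | undefined,
): string {
  return Object.entries(SERVICE_LABELS)
    .map(([key, label]) => `${label}: ${connected?.[key] ? "Connected" : "Not connected"}`)
    .join("\n");
}
```

Adding a new integration then means adding one entry to the table, and a missing or undefined `connected` object degrades to every service showing as not connected.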

Detailed Implementation Plan

Phase 0: Preparation

Estimated scope: Small. No code changes to production.

0.1 — Create feature branch

git checkout -b refactor/single-agent-architecture

0.2 — Audit tool descriptions

Review all 80+ tools and ensure each has a clear, specific description that tells the AI when to use it. This is critical for ToolSearchProcessor accuracy.

Good: "Search Gmail for emails matching criteria. Use Gmail query syntax: from:, subject:, is:unread, newer_than:, etc."
Bad: "Search emails"
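
Part of this audit can be automated with a rough lint that flags descriptions too short to be useful for discovery. A sketch, assuming a hypothetical `auditDescriptions` helper and an arbitrary eight-word threshold (the plan does not specify one); it is a first-pass filter, not a substitute for reading each description:

```typescript
// Hypothetical lint: flag tools whose descriptions are probably too terse for
// search-based discovery. The 8-word default is an arbitrary assumption.
interface ToolLike {
  id: string;
  description: string;
}

function auditDescriptions(tools: ToolLike[], minWords = 8): string[] {
  return tools
    .filter((t) => t.description.trim().split(/\s+/).filter(Boolean).length < minWords)
    .map((t) => t.id);
}

// The two examples above: only the terse one is flagged.
const flagged = auditDescriptions([
  {
    id: "search-emails-good",
    description:
      "Search Gmail for emails matching criteria. Use Gmail query syntax: from:, subject:, is:unread, newer_than:, etc.",
  },
  { id: "search-emails-bad", description: "Search emails" },
]);
// flagged contains only "search-emails-bad"
```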

0.3 — Create tool index file

Create apps/agents/src/mastra/tools/index.ts that exports all tools grouped by domain:

// Core tools — always loaded
export const coreTools = {
  createReminder,
  listReminders,
  cancelReminder,
  editReminder,
  resolveRecipient,
  submitFeedback,
};

// All discoverable tools — for ToolSearchProcessor
export const allDiscoverableTools = {
  // Gmail
  smartInbox: smartInboxTool,
  getEmail: getEmailTool,
  getThread: getThreadTool,
  searchEmails: searchEmailsTool,
  // ... all other tools
};

Phase 1: Build the New Agent

Estimated scope: Medium. Create new files alongside existing ones.

1.1 — Create the Consul agent

New file: apps/agents/src/mastra/agents/consul-agent.ts

Implements the single agent with:

Dynamic instructions via requestContext
Core tools always loaded
ToolSearchProcessor for discoverable tools
Memory with working memory (schema-based)
autoResumeSuspendedTools: true
Existing processors (MessageDeduplicator, ToolCallFilter, TokenLimiter)

1.2 — Add suspend/resume to write tools

Modify existing tool files to add suspendSchema, resumeSchema, and suspend logic to write/delete tools:

Tool File	Tools to Modify
gmail-tools.ts	sendEmail, archiveEmails (new consolidated), trashEmails (new), applyLabel
google-calendar-tools.ts	createEvent, updateEvent, deleteEvent, cancelEvent
google-drive-tools.ts	uploadFile, deleteFile, shareFile, trashFile
slack-tools.ts	sendSlackMessage

Pattern for each:

// Add to existing tool definition:
suspendSchema: z.object({ preview: z.string(), count: z.number() }),
resumeSchema: z.object({ approved: z.boolean() }),

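// Wrap execute with suspend logic:
execute: async (input, context) => {
  const { resumeData, suspend } = context?.agent ?? {};
  if (!resumeData) {
    // First call: build a preview and suspend for confirmation
    const preview = await buildPreview(input);
    return suspend?.({ preview, count: input.items.length });
  }
  if (!resumeData.approved) {
    // User declined: skip the original logic.
    // declinedResult is a hypothetical helper returning a no-op result that matches outputSchema
    return declinedResult(input);
  }
  // Approved: run the original execute logic
  return await originalLogic(input, context);
},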

1.3 — Create consolidated archive/trash/organize tools

Instead of separate removeInboxLabelTool, markAsReadTool, etc., create higher-level tools:

// New: Consolidates archive, mark read, mark unread, star, unstar
export const organizeEmailsTool = createTool({
  id: "organize-emails",
  description: "Organize emails: archive, mark read/unread, star/unstar. Suspends for confirmation on bulk operations.",
  inputSchema: z.object({
    messageIds: z.array(z.string()),
    action: z.enum(["archive", "markRead", "markUnread", "star", "unstar"]),
  }),
  // ... suspend for bulk, execute directly for single
});

// New: Consolidates trash and delete
export const deleteEmailsTool = createTool({
  id: "delete-emails",
  description: "Move emails to trash or permanently delete. Always suspends for confirmation.",
  inputSchema: z.object({
    messageIds: z.array(z.string()),
    permanent: z.boolean().default(false),
  }),
  // ... always suspend (destructive)
});

1.4 — Refactor composeEmailTool

The current compose email workflow (849 lines) becomes a sophisticated tool with multi-turn suspend:

export const composeEmailTool = createTool({
  id: "compose-email",
  description: "Compose and send an email. Handles recipient lookup, AI drafting, preview, edits, and sending. Suspends for user to review draft before sending.",
  inputSchema: z.object({
    recipient: z.string().describe("Recipient name or email"),
    contentHint: z.string().describe("What the email should be about"),
    mode: z.enum(["compose", "reply", "reply_all"]).default("compose"),
    replyToMessageId: z.string().optional(),
  }),
  suspendSchema: z.object({
    draft: z.object({
      to: z.string(),
      subject: z.string(),
      body: z.string(),
    }),
    message: z.string(),
  }),
  resumeSchema: z.object({
    approved: z.boolean(),
    edits: z.object({
      subject: z.string().optional(),
      body: z.string().optional(),
    }).optional(),
  }),
  execute: async (input, context) => {
    const { resumeData, suspend } = context?.agent ?? {};

    if (!resumeData) {
      // First call: resolve recipient, generate draft, suspend for preview
      const recipient = await resolveRecipient(input.recipient, context);
      const draft = await generateDraft(input, recipient, context);
      return suspend?.({
        draft: { to: recipient.email, subject: draft.subject, body: draft.body },
        message: "Review this email before sending:",
      });
    }

    if (!resumeData.approved) {
      return { sent: false, reason: "User declined" };
    }

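    // Apply edits if any, then send.
    // Note: resumeData carries only `approved` and `edits`, so applyEdits must also
    // be able to recover the draft that was shown in the preview.
    const finalDraft = applyEdits(resumeData);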
    await sendViaGmail(finalDraft, context);
    return { sent: true, to: finalDraft.to, subject: finalDraft.subject };
  },
});
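
The edit-merge step in the compose flow can be sketched as a pure function. This version assumes the draft shown in the preview is available again at resume time (how it is recovered is not specified in this plan) and takes it as an explicit argument. The `Draft` and `DraftEdits` types mirror the suspend and resume schemas above; the two-argument signature is an assumption.

```typescript
// Hypothetical sketch of the edit-merge step. Types mirror the suspendSchema
// draft and the resumeSchema edits of the compose-email tool above.
interface Draft {
  to: string;
  subject: string;
  body: string;
}

interface DraftEdits {
  subject?: string;
  body?: string;
}

// Returns a new draft with any user edits applied; the recipient is never
// changed here, and the original draft object is left unmodified.
function applyEdits(draft: Draft, edits?: DraftEdits): Draft {
  return {
    to: draft.to,
    subject: edits?.subject ?? draft.subject,
    body: edits?.body ?? draft.body,
  };
}
```

Using `??` rather than `||` means an intentionally empty subject or body supplied by the user is respected instead of silently falling back to the generated text.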

1.5 — Fix the smartInboxTool

Address the root bug: return marketing/newsletter message IDs, not just counts.

- // MARKETING: Just get count (don't need details)
- fetchMessageIds(`label:${marketingTag.gmail_label_id}`, 1)
+ // MARKETING: Get message IDs (needed for follow-up actions)
+ fetchMessageIds(`label:${marketingTag.gmail_label_id}`, 20)

- if (category === "Marketing") continue; // Skip marketing details
+ // Include marketing details so agent can act on them

- newsletters: 0, // Could add separate newsletter detection
+ newsletters: newsletterCount,  // Detected via List-Unsubscribe header
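
The diff above leaves open how `newsletterCount` is computed. One way to do the List-Unsubscribe detection it mentions, sketched with hypothetical helper names and assuming each message exposes its headers as `{ name, value }` pairs (the shape the Gmail API uses for `payload.headers`):

```typescript
// Hypothetical sketch of newsletter detection via the List-Unsubscribe header.
interface MessageHeader {
  name: string;
  value: string;
}

// Header names are case-insensitive, so compare in lower case.
function hasListUnsubscribe(headers: MessageHeader[]): boolean {
  return headers.some(
    (h) => h.name.toLowerCase() === "list-unsubscribe" && h.value.trim().length > 0,
  );
}

function countNewsletters(messages: { headers: MessageHeader[] }[]): number {
  return messages.filter((m) => hasListUnsubscribe(m.headers)).length;
}
```

This heuristic also matches other bulk mail that carries the header, so the count is an upper bound on true newsletters rather than an exact classification.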

Phase 2: Register and Route

Estimated scope: Small. Wire up the new agent.

2.1 — Update Mastra config (index.ts)

Register the new consul agent alongside existing agents (don't remove old ones yet):

agents: {
  consulAgent,           // NEW single agent
  webOrchestratorAgent,  // Keep registered (unused) so rollback stays a one-line change
  // Keep existing for iMessage and email orchestrators
  imessageOrchestratorAgent,
  emailOrchestratorAgent,
  // Keep background-only agents
  emailTriageAgent,
  salesAgent,
},

2.2 — Update chat route

Modify chat-with-logging.ts to use the new agent for web chat:

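// Change agent reference from webOrchestratorAgent to consulAgent.
// getAgent looks up by registration key ("consulAgent" in index.ts), not by the agent's id ("consul-agent").
const agent = mastra.getAgent("consulAgent");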

2.3 — Update frontend client

Update apps/web/lib/mastra/client.ts to point to the new agent ID if needed.

Phase 3: Handle Channel Differences

Estimated scope: Medium. Ensure iMessage and email channels still work.

3.1 — iMessage orchestrator

The iMessage orchestrator has unique requirements:

Server-side memory (lastMessages: 15)
sendResponse tool for replying via gateway
One-tool-call-at-a-time for message ordering

Options:

Option A (Recommended): Create imessageConsulAgent that extends the core pattern with iMessage-specific memory and tools. Shares the same ToolSearchProcessor and tool library.
Option B: Keep existing iMessage orchestrator but point it at consolidated tools instead of sub-agents.

// Option A: iMessage variant of the Consul agent
export const imessageConsulAgent = new Agent({
  id: "imessage-consul-agent",
  model: "openai/gpt-4.1-mini",
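  instructions: async ({ requestContext }) => {
    // buildInstructions takes only requestContext; iMessage-specific wording can
    // key off requestContext.get("channel"), as consulAgent's tools function does
    return buildInstructions(requestContext);
  },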
  tools: async ({ requestContext }) => ({
    ...coreTools,
    sendResponse: sendResponseTool,
    startScheduleMeeting: startScheduleMeetingTool,
  }),
  memory: new Memory({
    options: {
      lastMessages: 15,
      workingMemory: { enabled: true, scope: "resource" },
    },
  }),
  inputProcessors: [
    new MessageDeduplicator(),
    new ToolSearchProcessor({
      tools: allDiscoverableTools,
      search: { topK: 8, minScore: 0.1 },
    }),
    new ToolCallFilter({ exclude: ["compose-email", "schedule-meeting"] }),
    new TokenLimiter(127000),
  ],
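  defaultOptions: {
    autoResumeSuspendedTools: true,
    maxSteps: 8,
    providerOptions: {
      openai: { parallel_tool_calls: false },  // One tool call at a time for message ordering
    },
  },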
});

3.2 — Email orchestrator

The email orchestrator handles inbound email triage. This is a specialized flow that may benefit from staying as a focused agent with a subset of tools. Evaluate after web + iMessage are migrated.

Phase 4: Cleanup

Estimated scope: Medium. Remove dead code.

4.1 — Remove old orchestrators

Once the new agents are stable:

Delete agents/orchestrator/web-orchestrator-agent.ts
Delete agents/orchestrator/imessage-orchestrator-agent.ts (if Option A)

4.2 — Remove domain agent pairs

Delete agents/gmail/gmail-query-agent.ts and gmail-action-agent.ts
Delete agents/google-calendar/ query/action pair
Delete agents/google-drive/ query/action pair
Delete agents/slack/ query/action pair
Keep: agents/scheduling/, agents/google-contacts/, agents/google-docs/ (if any contain unique logic not in tools)

4.3 — Remove HITL workflows replaced by tool suspend

Delete workflows/hitl/email-action-workflow.ts
Delete workflows/hitl/calendar-action-workflow.ts
Delete workflows/hitl/drive-action-workflow.ts
Delete workflows/hitl/slack-action-workflow.ts
Delete workflows/compose-email-workflow.ts

Keep: All background workflows (daily brief, triage, sales, iMessage send, tag notification).

4.4 — Remove unused utility agents

Delete any agent not registered in index.ts:

planningAgent, validatorAgent, analysisAgent (if only used by removed orchestrators)

4.5 — Update Mastra config

Remove the deleted agents and workflows from the index.ts registration.

Phase 5: Polish and Optimize

Estimated scope: Small. Fine-tune after migration.

5.1 — Tune tool descriptions

After deploying, monitor which tools get selected incorrectly and refine descriptions. Per Anthropic: "We spent more time optimizing tools than the overall prompt."

5.2 — Tune ToolSearchProcessor parameters

Adjust topK (start with 8, may need more or fewer)
Adjust minScore (start with 0.1, increase if irrelevant tools load)
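
To build intuition for how the two parameters interact, here is a toy ranking function: `minScore` drops weak matches first, then `topK` caps how many survivors are returned. The scoring below (fraction of query words found in the tool's id or description) is a deliberately simple stand-in and is not ToolSearchProcessor's actual ranking; all names are hypothetical.

```typescript
// Toy model of topK / minScore filtering. The scoring is NOT what
// ToolSearchProcessor does internally; it only illustrates the two knobs.
interface ToolDoc {
  id: string;
  description: string;
}

const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// Score = fraction of query tokens that appear in the tool's id or description.
function scoreTool(query: string, tool: ToolDoc): number {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;
  const docTokens = new Set(tokenize(`${tool.id} ${tool.description}`));
  return queryTokens.filter((t) => docTokens.has(t)).length / queryTokens.length;
}

function searchTools(query: string, tools: ToolDoc[], topK: number, minScore: number): string[] {
  return tools
    .map((tool) => ({ id: tool.id, score: scoreTool(query, tool) }))
    .filter((r) => r.score >= minScore) // minScore: drop irrelevant tools
    .sort((a, b) => b.score - a.score)  // best matches first
    .slice(0, topK)                     // topK: cap the result size
    .map((r) => r.id);
}
```

Raising `minScore` trims the tail of weak matches, while lowering `topK` limits how many tools can be surfaced per search even when many score well.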

5.3 — Add EnsureFinalResponseProcessor

Prevent empty responses when hitting maxSteps:

new EnsureFinalResponseProcessor(10) // maxSteps = 10

5.4 — Consider observational memory

For long conversations, add observational memory to compress old messages:

memory: new Memory({
  options: {
    observationalMemory: {
      enabled: true,
      scope: "resource",
      observation: { messageTokens: 30_000 },
      reflection: { observationTokens: 40_000 },
    },
  },
});

5.5 — KV-Cache optimization

Per Manus team's guidance: cached tokens cost 10x less than uncached. Ensure:

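System prompt prefix is stable (no timestamps in instructions; pass them via tool/context). The Context section of buildInstructions currently embeds the current date and time, so that line would need to move out of the prompt.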
Context is append-only where possible
Tool definitions don't change between steps (ToolSearchProcessor handles this)
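
One way to keep the prompt prefix stable is to separate what never changes between requests from what does. A sketch under that assumption: `buildStableInstructions` and `buildTimeContext` are hypothetical names, and how the volatile string is delivered (a tool result, a context message) is left open.

```typescript
// Hypothetical sketch: keep volatile values out of the cached prompt prefix.
interface DateTimeContext {
  dayOfWeek: string;
  date: string;
  time: string;
  timezone: string;
}

// Static part: byte-identical on every request for a given assistant name,
// so a provider-side prefix cache can reuse it.
function buildStableInstructions(assistantName: string): string {
  return [
    `You are ${assistantName}, a personal AI executive assistant.`,
    "## Identity",
    "Confident, direct, helpful. Brief for actions, thorough for questions.",
  ].join("\n");
}

// Volatile part: changes every minute, so it is produced separately and
// delivered outside the instructions.
function buildTimeContext(now: DateTimeContext): string {
  return `Now: ${now.dayOfWeek}, ${now.date} at ${now.time} (${now.timezone})`;
}
```

Per-user values such as the assistant name still vary between users, but they are stable within a conversation, which is the scope that matters for prefix caching.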

Migration Strategy

Parallel Deploy (Safe)

Deploy new agent alongside existing agents
Route web chat to new agent, keep iMessage/email on existing
Monitor for 1-2 weeks
Migrate iMessage channel
Clean up old code

Rollback Plan

Keep old orchestrators registered but unused. If the new agent has issues:

Switch chat route back to webOrchestratorAgent (one-line change)
No data migration needed — same tools, same APIs, same storage
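
The one-line switch could also be driven by configuration so that rolling back does not require a deploy. A sketch assuming a hypothetical `USE_LEGACY_ORCHESTRATOR` environment flag (not an existing Consul setting) and agent keys that match the registration names in index.ts:

```typescript
// Hypothetical rollback switch. USE_LEGACY_ORCHESTRATOR is an assumed flag
// name; the returned keys are assumed to match the index.ts registration.
type WebAgentKey = "consulAgent" | "webOrchestratorAgent";

function selectWebAgentKey(env: Record<string, string | undefined>): WebAgentKey {
  return env.USE_LEGACY_ORCHESTRATOR === "true" ? "webOrchestratorAgent" : "consulAgent";
}

// In the chat route this would be called as selectWebAgentKey(process.env).
const defaultAgentKey = selectWebAgentKey({});
// defaultAgentKey === "consulAgent"
```

The new agent is the default, so the flag only needs to be set during an incident and can be removed entirely in Phase 4.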

Testing Strategy

Tool discovery: Verify ToolSearchProcessor finds correct tools for common queries
HITL flow: Verify suspend/resume works for all write operations
autoResumeSuspendedTools: Verify natural conversation flow for confirmations
Context preservation: Verify working memory persists user preferences
Edge cases: Multi-step requests, error recovery, missing integrations

Files Changed Summary

New Files

agents/consul-agent.ts                    — Main single agent
agents/imessage-consul-agent.ts           — iMessage variant (if Option A)
tools/index.ts                            — Tool registry with core + discoverable

Modified Files

index.ts                                  — Register new agents
routes/chat-with-logging.ts               — Point to new agent
tools/gmail-tools.ts                      — Add suspend to write tools, fix smartInbox
tools/google-calendar-tools.ts            — Add suspend to write tools
tools/google-drive-tools.ts               — Add suspend to write tools
tools/slack-tools.ts                      — Add suspend to sendMessage
tools/compose-email-tool.ts               — Refactor with suspend/resume

Deleted Files (Phase 4)

agents/orchestrator/web-orchestrator-agent.ts
agents/gmail/gmail-query-agent.ts
agents/gmail/gmail-action-agent.ts
agents/google-calendar/google-calendar-query-agent.ts
agents/google-calendar/google-calendar-action-agent.ts
agents/google-drive/google-drive-query-agent.ts
agents/google-drive/google-drive-action-agent.ts
agents/slack/slack-query-agent.ts
agents/slack/slack-action-agent.ts
workflows/hitl/email-action-workflow.ts
workflows/hitl/calendar-action-workflow.ts
workflows/hitl/drive-action-workflow.ts
workflows/hitl/slack-action-workflow.ts
workflows/compose-email-workflow.ts

Unchanged Files

middleware/index.ts                        — Keep entire middleware pipeline
lib/token-resolver.ts                     — Keep three-tier resolution
lib/token-fetcher.ts                      — Keep Supabase fetch
services/*                                — Keep all services
processors/message-deduplicator.ts        — Keep (critical for OpenAI)
workflows/daily-brief-workflow.ts         — Keep (background)
workflows/email-triage-workflow.ts        — Keep (background)
workflows/sales-processing-workflow.ts    — Keep (background)
workflows/imessage-send-workflow.ts       — Keep (gateway)
tools/reminder-tools.ts                   — Keep as core tools
All tool implementations                  — Keep (just add suspend where needed)

Key Risks and Mitigations

Risk	Impact	Mitigation
ToolSearchProcessor returns wrong tools	Medium — agent uses wrong tool	Tune descriptions, adjust topK/minScore, monitor in production
Tool suspend state lost on restart	High — HITL confirmations fail	Ensure LibSQLStore configured for snapshot persistence (already have Turso)
Context window overflow with many tools	Medium — degraded responses	TokenLimiter + ToolSearchProcessor keeps context bounded
Agent makes mistakes without routing guardrails	Medium — wrong actions taken	Tool-level suspend catches dangerous operations before execution
MessageDeduplicator compatibility	High — OpenAI API errors	Port processor as-is, test thoroughly
useChat compatibility	High — web chat breaks	Test that `autoResumeSuspendedTools` works with `@ai-sdk/react`
iMessage channel differences	Medium — different memory needs	Separate iMessage agent variant with its own memory config