Consul Agent System Architecture
Comprehensive Technical Documentation
This document provides a detailed overview of the Consul Agent AI system architecture, including agent hierarchy, workflows, tools, memory patterns, and observability.
Table of Contents
- System Overview
- High-Level Architecture
- Agent Hierarchy
- Request Flow
- Token Resolution
- Memory Architecture
- Workflow Patterns
- Tool Categories
- Observability & Billing
- Security Model
System Overview
Consul Agent is a multi-channel AI assistant that coordinates tasks across email, calendar, messaging, and productivity tools. It is built on Mastra, a TypeScript framework for building AI agents.
Key Characteristics
| Aspect | Implementation |
|---|---|
| Channels | Web Chat, Email (AgentMail), iMessage/SMS (Photon) |
| Agent Pattern | Hierarchical Orchestrators → Domain Agents |
| Safety Model | HITL (Human-in-the-Loop) for all write operations |
| Storage | Supabase (user data) + Turso (AI spans/memory) |
| Observability | Token billing, trace registration, Mastra Cloud |
High-Level Architecture
Agent Hierarchy
The agent system follows a hierarchical delegation pattern where orchestrators route requests to specialized domain agents.
Agent Roles
| Agent Type | Purpose | Example Agents |
|---|---|---|
| Orchestrators | Route requests to specialists | Web, Email, iMessage |
| Query Agents | Read-only operations | Gmail Query, Calendar Query |
| Action Agents | Write operations (HITL protected) | Gmail Action, Calendar Action |
| Unified Agents | Combined read/write (low-risk) | Contacts, Docs, Reminders |
| Specialized | Domain-specific logic | Sales, Triage, Scheduling |
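The delegation pattern in the table above can be sketched roughly as follows. This is an illustration only, not Mastra's API: `DomainAgent`, `Router`, `createOrchestrator`, and the keyword-based router are hypothetical names introduced here. The real system is described as using LLM-driven routing, which the pluggable `route` function stands in for.

```typescript
// Hypothetical types for illustration; these are not Mastra APIs.
type AgentKind = "query" | "action" | "unified" | "specialized";

interface DomainAgent {
  name: string;
  kind: AgentKind;
  handle(request: string): string;
}

// A router returns the name of the agent that should handle a request.
// In the real system this decision is described as LLM-driven; a plain
// function is used here so the sketch stays self-contained.
type Router = (request: string, agents: DomainAgent[]) => string | undefined;

function createOrchestrator(agents: DomainAgent[], route: Router) {
  return {
    dispatch(request: string): string {
      const targetName = route(request, agents);
      const agent = agents.find((a) => a.name === targetName);
      return agent
        ? agent.handle(request)
        : "No specialist available for this request.";
    },
  };
}

// Example wiring with two stub agents and a keyword router (assumption).
const gmailQuery: DomainAgent = {
  name: "gmail-query",
  kind: "query",
  handle: (request) => `gmail-query handled: ${request}`,
};

const calendarAction: DomainAgent = {
  name: "calendar-action",
  kind: "action",
  handle: (request) => `calendar-action handled: ${request}`,
};

const keywordRouter: Router = (request) => {
  const text = request.toLowerCase();
  if (text.includes("schedule")) return "calendar-action";
  if (text.includes("email")) return "gmail-query";
  return undefined;
};

const webOrchestrator = createOrchestrator(
  [gmailQuery, calendarAction],
  keywordRouter,
);
```

Calling `webOrchestrator.dispatch("Find the email from Sam")` hands the request to the stub `gmail-query` agent. A request that matches no agent falls through to the fallback message rather than throwing.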
Request Flow
Web Chat Request Flow
iMessage/SMS Request Flow
Token Resolution
OAuth tokens are resolved using a three-tier strategy that balances performance with reliability.
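As a rough sketch of the three tiers, assuming they are tried in the order direct input, request context, then a database fetch: the function below is not the contents of `token-resolver.ts`. `resolveTokenSketch`, `TokenContext`, and `TokenFetcher` are hypothetical names, and the database call is replaced by an injected function.

```typescript
type GoogleService = "gmail" | "google_calendar" | "google_drive";

// Hypothetical shape of the per-request context; field names are assumptions.
interface TokenContext {
  userId: string;
  tokens: Partial<Record<GoogleService, string>>;
}

// Stand-in for the database lookup (Supabase in the real system).
type TokenFetcher = (
  userId: string,
  service: GoogleService,
) => Promise<string | undefined>;

async function resolveTokenSketch(
  service: GoogleService,
  directToken: string | undefined,
  context: TokenContext | undefined,
  fetchToken: TokenFetcher,
): Promise<string> {
  // Tier 1: a token passed directly by the caller wins.
  if (directToken) return directToken;

  // Tier 2: a token already present on the request context.
  const fromContext = context?.tokens[service];
  if (fromContext) return fromContext;

  // Tier 3: fall back to the database, then remember the result on the
  // context so later calls in the same request skip the fetch.
  if (context) {
    const fetched = await fetchToken(context.userId, service);
    if (fetched) {
      context.tokens[service] = fetched;
      return fetched;
    }
  }

  throw new Error(`No ${service} token available`);
}
```

Writing the fetched token back onto the context is one way to get the reduction in database queries that the design aims for. Whether the real resolver caches this way is not stated in this document.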
Token Resolution Code Pattern
```typescript
import { resolveGoogleToken } from "../lib/token-resolver";

const accessToken = await resolveGoogleToken(
  "gmail",                  // Service: gmail | google_calendar | google_drive
  inputData.accessToken,    // Tier 1: Direct input
  context?.requestContext   // Tier 2/3: Context or Supabase fetch
);
```
Memory Architecture
Memory configuration varies by channel to prevent errors and optimize token usage.
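One way to picture per-channel memory configuration is a lookup table consulted when the prompt is assembled. Everything in this sketch is an assumption made for illustration: the option names (`recallHistory`, `maxRecalledMessages`), the numeric values, and the premise that the web client already sends its visible conversation are not taken from the real configuration.

```typescript
type Channel = "web" | "email" | "imessage";

// Hypothetical settings; the real option names and values are not given here.
interface ChannelMemorySettings {
  // Whether the server recalls earlier messages from storage.
  recallHistory: boolean;
  // Upper bound on recalled messages, to limit prompt tokens.
  maxRecalledMessages: number;
}

const MEMORY_BY_CHANNEL: Record<Channel, ChannelMemorySettings> = {
  // Assumption: the web client resends its conversation, so recalling stored
  // history as well would duplicate messages.
  web: { recallHistory: false, maxRecalledMessages: 0 },
  // Placeholder values for channels that deliver one message at a time.
  email: { recallHistory: true, maxRecalledMessages: 10 },
  imessage: { recallHistory: true, maxRecalledMessages: 20 },
};

function buildPromptMessages(
  channel: Channel,
  stored: string[],
  incoming: string[],
): string[] {
  const settings = MEMORY_BY_CHANNEL[channel];
  if (!settings.recallHistory || settings.maxRecalledMessages <= 0) {
    return [...incoming];
  }
  // Keep only the most recent stored messages, then append the new ones.
  const recalled = stored.slice(-settings.maxRecalledMessages);
  return [...recalled, ...incoming];
}
```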
Memory ID Strategy
Input Processors
All orchestrators use the same standard input processors to prevent errors.
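The file listing in this document names one custom processor, `message-deduplicator.ts`, but its contents are not shown. The sketch below only illustrates the general idea of deduplicating messages before they reach the model. `ChatMessage` and `deduplicateMessages` are hypothetical names, and keying on a message `id` is an assumption.

```typescript
// Hypothetical message shape; the real processor's input type is not shown.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
}

// Keeps the first occurrence of each message id and preserves order.
function deduplicateMessages(messages: ChatMessage[]): ChatMessage[] {
  const seen = new Set<string>();
  const result: ChatMessage[] = [];
  for (const message of messages) {
    if (seen.has(message.id)) continue;
    seen.add(message.id);
    result.push(message);
  }
  return result;
}
```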
Workflow Patterns
HITL (Human-in-the-Loop) Workflow
All write operations require user approval through the HITL pattern.
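A minimal sketch of the approval gate, assuming a two-step propose/resolve flow: it is not Mastra's suspend/resume API, and `createHitlGate`, `PendingWrite`, and `Resolution` are hypothetical names. The point it illustrates is that the write closure never runs until an explicit approval arrives.

```typescript
interface PendingWrite {
  id: string;
  description: string;
  execute: () => string;
}

type Resolution =
  | { status: "executed"; result: string }
  | { status: "rejected" }
  | { status: "unknown" };

function createHitlGate() {
  const pending = new Map<string, PendingWrite>();
  let nextId = 1;

  return {
    // Step 1: register the write and pause. Nothing is executed yet.
    propose(description: string, execute: () => string): string {
      const id = `write-${nextId++}`;
      pending.set(id, { id, description, execute });
      return id;
    },

    // Step 2: resume with the user's decision. Each proposal can be
    // resolved once; a second attempt reports "unknown".
    resolve(id: string, approved: boolean): Resolution {
      const write = pending.get(id);
      if (!write) return { status: "unknown" };
      pending.delete(id);
      if (!approved) return { status: "rejected" };
      return { status: "executed", result: write.execute() };
    },
  };
}
```

Removing the pending entry before executing means an approval cannot be replayed to run the same write twice.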
Conversational Workflow (Compose Email)
Multi-turn workflows preserve state across suspend/resume cycles.
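To illustrate state surviving a suspend/resume cycle, the sketch below collects an email draft one field per turn and serializes the partial draft each time it pauses. It is a simplified stand-in, not the actual compose workflow: `DraftState`, `advanceDraft`, `resumeDraft`, and the question wording are all hypothetical.

```typescript
interface DraftState {
  to?: string;
  subject?: string;
  body?: string;
}

type DraftStep =
  | { status: "suspended"; question: string; snapshot: string }
  | { status: "complete"; draft: Required<DraftState> };

// Fields are requested in this order until the draft is complete.
const DRAFT_QUESTIONS: Array<[keyof DraftState, string]> = [
  ["to", "Who should receive the email?"],
  ["subject", "What is the subject?"],
  ["body", "What should the email say?"],
];

// Suspends at the first missing field, carrying the state as a snapshot.
function advanceDraft(state: DraftState): DraftStep {
  for (const [field, question] of DRAFT_QUESTIONS) {
    if (!state[field]) {
      return { status: "suspended", question, snapshot: JSON.stringify(state) };
    }
  }
  return { status: "complete", draft: state as Required<DraftState> };
}

// Restores the snapshot, applies the user's answer, and continues.
function resumeDraft(
  snapshot: string,
  field: keyof DraftState,
  answer: string,
): DraftStep {
  const state: DraftState = JSON.parse(snapshot);
  state[field] = answer;
  return advanceDraft(state);
}
```

Because the snapshot is a plain string, it can be stored between turns and the process that resumes does not need to be the one that suspended.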
Daily Brief Workflow
Inngest-Based Scheduling
Long-running workflows use Inngest for durability.
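Durability in step-based engines generally comes from persisting each completed step's result, so that a retried run skips work that already succeeded. The sketch below shows that idea in isolation. It does not use the Inngest SDK, and `StepLog`, `createRun`, and `schedulingWorkflowSketch` are hypothetical names.

```typescript
// Persisted results of completed steps, keyed by step id.
type StepLog = Map<string, unknown>;

function createRun(log: StepLog) {
  return {
    // Runs fn once per step id; later attempts reuse the stored result.
    step<T>(id: string, fn: () => T): T {
      if (log.has(id)) return log.get(id) as T;
      const result = fn();
      log.set(id, result);
      return result;
    },
  };
}

// A two-step workflow. `effects` records which steps actually executed, and
// `failAtSend` simulates a transient failure in the second step.
function schedulingWorkflowSketch(
  log: StepLog,
  effects: string[],
  failAtSend: boolean,
): string {
  const run = createRun(log);

  const slot = run.step("find-slot", () => {
    effects.push("find-slot");
    return "Tuesday 10:00";
  });

  return run.step("send-invite", () => {
    if (failAtSend) throw new Error("transient failure");
    effects.push("send-invite");
    return `invite sent for ${slot}`;
  });
}
```

If the first attempt fails at `send-invite`, a second attempt with the same log re-enters the function from the top but does not repeat `find-slot`. In a real deployment the log would live in durable storage rather than in memory.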
Tool Categories
Tool Organization
Tool Execution Pattern
Observability & Billing
Token Billing Architecture
Consumption Event Schema
Three-Exporter System
Security Model
Request Context Whitelisting
HITL Protection Matrix
Token Encryption Flow
File Organization
```
apps/agents/src/mastra/
├── index.ts                  # Mastra config, middleware, exports
├── agents/                   # 23 agents
│   ├── orchestrator/         # Web, Email, iMessage orchestrators
│   ├── gmail/                # Query + Action agents
│   ├── google-calendar/      # Query + Action agents
│   ├── google-drive/         # Query + Action agents
│   ├── slack/                # Query + Action agents
│   ├── scheduling/           # Scheduling agent
│   └── ...                   # Other domain agents
├── tools/                    # 30+ tools
│   ├── gmail-tools.ts
│   ├── google-calendar-tools.ts
│   ├── scheduling/           # 8 scheduling tools
│   └── ...
├── workflows/                # 11 workflows
│   ├── email-triage-workflow.ts
│   ├── daily-brief/
│   ├── hitl/                 # 4 HITL workflows
│   └── inngest/              # Durable scheduling
├── services/                 # 32 services
│   ├── token-billing/        # Billing exporter
│   ├── supabase.ts
│   ├── turso.ts
│   └── ...
├── lib/                      # Utilities
│   ├── token-resolver.ts
│   ├── token-fetcher.ts
│   └── ...
├── routes/                   # Custom API routes
│   └── chat-with-logging.ts
├── processors/               # Custom processors
│   └── message-deduplicator.ts
└── types/                    # TypeScript types
    └── request-context-types.ts
```
Key Architectural Decisions
| Decision | Rationale |
|---|---|
| Query/Action Agent Split | Separates read (fast, no approval) from write (HITL protected) |
| Three-Tier Token Resolution | Reduces DB queries via caching while ensuring reliability |
| Channel-Specific Memory | Prevents duplicate errors (web) while maintaining context (iMessage) |
| HITL for All Writes | User always approves mutations before execution |
| Inngest for Long Workflows | Durability for scheduling workflows that span hours/days |
| Agent Network Routing | LLM-driven routing is more resilient than instruction-based routing |
| Observable Token Billing | Every LLM call traced to user for accurate billing |
Last updated: January 2025