Consul Agent System Architecture

Comprehensive Technical Documentation

This document provides a detailed overview of the Consul Agent AI system architecture, including agent hierarchy, workflows, tools, memory patterns, and observability.

System Overview
High-Level Architecture
Agent Hierarchy
Request Flow
Token Resolution
Memory Architecture
Workflow Patterns
Tool Categories
Observability & Billing
Security Model

System Overview

Consul Agent is a multi-channel AI assistant that orchestrates work across email, calendar, messaging, and productivity tools. The system is built on Mastra, a TypeScript framework for building AI agents.

Key Characteristics

Aspect	Implementation
Channels	Web Chat, Email (AgentMail), iMessage/SMS (Photon)
Agent Pattern	Hierarchical Orchestrators → Domain Agents
Safety Model	HITL (Human-in-the-Loop) for all write operations
Storage	Supabase (user data) + Turso (AI spans/memory)
Observability	Token billing, trace registration, Mastra Cloud

High-Level Architecture

Agent Hierarchy

The agent system follows a hierarchical delegation pattern where orchestrators route requests to specialized domain agents.

Agent Roles

Agent Type	Purpose	Example Agents
Orchestrators	Route requests to specialists	Web, Email, iMessage
Query Agents	Read-only operations	Gmail Query, Calendar Query
Action Agents	Write operations (HITL protected)	Gmail Action, Calendar Action
Unified Agents	Combined read/write (low-risk)	Contacts, Docs, Reminders
Specialized	Domain-specific logic	Sales, Triage, Scheduling
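
As a rough illustration of the delegation pattern in the table above, the sketch below picks a query, action, or unified agent for a request. All names here (`AgentKind`, `registry`, `route`) are hypothetical and not taken from the Consul Agent codebase. The Key Architectural Decisions section describes the real routing as LLM-driven, so this plain lookup only shows the shape of the hierarchy, not how routing decisions are actually made.

```typescript
// Hypothetical sketch: none of these names come from the Consul Agent codebase.
type AgentKind = "query" | "action" | "unified";

interface DomainAgent {
  name: string;
  kind: AgentKind;
}

interface RoutedRequest {
  domain: string;   // e.g. "gmail", "calendar", "contacts"
  isWrite: boolean; // true when the request would mutate user data
}

// Registry keyed by domain. Split domains expose separate query/action agents;
// unified domains expose a single agent for both reads and writes.
const registry: Record<string, DomainAgent[]> = {
  gmail: [
    { name: "Gmail Query", kind: "query" },
    { name: "Gmail Action", kind: "action" },
  ],
  calendar: [
    { name: "Calendar Query", kind: "query" },
    { name: "Calendar Action", kind: "action" },
  ],
  contacts: [{ name: "Contacts", kind: "unified" }],
};

function route(req: RoutedRequest): DomainAgent {
  const candidates = registry[req.domain];
  if (!candidates) {
    throw new Error(`No agent registered for domain "${req.domain}"`);
  }
  // A unified agent handles both reads and writes for its domain.
  const unified = candidates.find((a) => a.kind === "unified");
  if (unified) return unified;

  // Otherwise pick the read-only or write-capable specialist.
  const wanted: AgentKind = req.isWrite ? "action" : "query";
  const match = candidates.find((a) => a.kind === wanted);
  if (!match) {
    throw new Error(`No ${wanted} agent for domain "${req.domain}"`);
  }
  return match;
}
```

For example, `route({ domain: "gmail", isWrite: true })` selects the Gmail Action agent, which is the HITL-protected path.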

Request Flow

Web Chat Request Flow

iMessage/SMS Request Flow

Token Resolution

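OAuth tokens are resolved using a three-tier strategy that balances performance with reliability: a token passed directly as input is used first, then one available in the request context, and finally one fetched from Supabase.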

Token Resolution Code Pattern

import { resolveGoogleToken } from "../lib/token-resolver";

const accessToken = await resolveGoogleToken(
  "gmail",                    // Service: gmail | google_calendar | google_drive
  inputData.accessToken,      // Tier 1: Direct input
  context?.requestContext     // Tier 2/3: Context or Supabase fetch
);
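
The internals of `resolveGoogleToken` are not shown in this document. The following is a minimal sketch of how a three-tier fallback could work, under stated assumptions: the `RequestContextLike` shape, the `resolveTokenSketch` name, and the injected `fetchFromStore` callback (standing in for the Supabase lookup) are all hypothetical.

```typescript
// Hypothetical sketch of a three-tier fallback; not the actual token-resolver.ts.
type Service = "gmail" | "google_calendar" | "google_drive";

// Assumed shape: the real request context type is not shown in this document.
interface RequestContextLike {
  userId?: string;
  tokens?: Partial<Record<Service, string>>;
}

// Stand-in for the persistent lookup (Supabase in the real system).
type TokenFetcher = (
  userId: string,
  service: Service,
) => Promise<string | undefined>;

async function resolveTokenSketch(
  service: Service,
  directToken: string | undefined,
  ctx: RequestContextLike | undefined,
  fetchFromStore: TokenFetcher,
): Promise<string> {
  // Tier 1: a token passed directly as input wins.
  if (directToken) return directToken;

  // Tier 2: a token already carried in the request context avoids a DB query.
  const fromContext = ctx?.tokens?.[service];
  if (fromContext) return fromContext;

  // Tier 3: fall back to fetching from persistent storage.
  if (ctx?.userId) {
    const fetched = await fetchFromStore(ctx.userId, service);
    if (fetched) return fetched;
  }

  throw new Error(`No ${service} token available`);
}
```

Ordering the tiers from cheapest to most expensive means the storage lookup runs only when neither the caller nor the request context already holds a token.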

Memory Architecture

Memory configuration varies by channel to prevent errors and optimize token usage.

Memory ID Strategy

Input Processors

All orchestrators use a standard set of input processors to prevent errors.
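
The processor list itself is not reproduced here. As one illustration of what an error-preventing input processor can do, the sketch below drops messages whose ID has already been seen, in the spirit of the `message-deduplicator.ts` file listed under File Organization. The `ChatMessage` shape and the `dedupeMessages` function are assumptions for illustration; this is not Mastra's processor interface and not the project's actual deduplicator.

```typescript
// Hypothetical sketch: the message shape and function name are assumptions.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
}

// Keep the first occurrence of each message ID and preserve the original order.
function dedupeMessages(messages: ChatMessage[]): ChatMessage[] {
  const seen = new Set<string>();
  const result: ChatMessage[] = [];
  for (const message of messages) {
    if (seen.has(message.id)) continue; // duplicate: skip it
    seen.add(message.id);
    result.push(message);
  }
  return result;
}
```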

Workflow Patterns

HITL (Human-in-the-Loop) Workflow

All write operations require user approval through the HITL pattern.
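
A minimal sketch of such an approval gate follows. The `PendingAction` type and the `proposeWrite` and `resumeWrite` helpers are hypothetical; the real workflows rely on the framework's suspend/resume support, which is not shown here. A write is first recorded as pending, and it executes only after the user approves it.

```typescript
// Hypothetical sketch of an approval gate; names are not from the codebase.
type Decision = "approved" | "rejected";

interface PendingAction<T> {
  id: string;
  description: string; // shown to the user for review
  execute: () => T;    // the write itself, deferred until approval
}

type GateResult<T> =
  | { status: "executed"; value: T }
  | { status: "rejected" };

// In-memory store of writes awaiting a decision (a real system would persist this).
const pending = new Map<string, PendingAction<unknown>>();

// Step 1 (suspend): record the proposed write and return its ID for the approval prompt.
function proposeWrite<T>(action: PendingAction<T>): string {
  pending.set(action.id, action);
  return action.id;
}

// Step 2 (resume): run the write only if the user approved it.
function resumeWrite(id: string, decision: Decision): GateResult<unknown> {
  const action = pending.get(id);
  if (!action) throw new Error(`No pending action with id "${id}"`);
  pending.delete(id); // each proposal can be decided only once
  if (decision === "rejected") return { status: "rejected" };
  return { status: "executed", value: action.execute() };
}
```

The key property is that `execute` is never called inside `proposeWrite`, so no mutation can happen before a decision arrives.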

Conversational Workflow (Compose Email)

Multi-turn workflows preserve state across suspend/resume cycles.
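
To illustrate how state can survive a suspend/resume cycle, the sketch below serializes a draft between turns and merges each new user reply into it. The `ComposeState` fields and the `suspend`, `resume`, and `missingFields` helpers are hypothetical; the actual compose workflow's schema is not shown in this document.

```typescript
// Hypothetical sketch: field names and helpers are illustrative only.
interface ComposeState {
  to?: string;
  subject?: string;
  body?: string;
}

const REQUIRED: (keyof ComposeState)[] = ["to", "subject", "body"];

// Serialize the state when the workflow suspends to wait for the user.
function suspend(state: ComposeState): string {
  return JSON.stringify(state);
}

// On resume, restore the snapshot and merge in what the user just supplied.
function resume(snapshot: string, update: Partial<ComposeState>): ComposeState {
  const restored = JSON.parse(snapshot) as ComposeState;
  return { ...restored, ...update };
}

// Fields that are still missing determine what to ask for on the next turn.
function missingFields(state: ComposeState): (keyof ComposeState)[] {
  return REQUIRED.filter((key) => !state[key]);
}
```

Each turn calls `resume` with the stored snapshot, checks `missingFields`, and either suspends again with the merged state or proceeds to the approval step once nothing is missing.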

Daily Brief Workflow

Inngest-Based Scheduling

Long-running workflows use Inngest for durability.

Tool Categories

Tool Organization

Tool Execution Pattern

Observability & Billing

Token Billing Architecture

Consumption Event Schema

Three-Exporter System

Security Model

Request Context Whitelisting

HITL Protection Matrix

Token Encryption Flow

File Organization

apps/agents/src/mastra/
├── index.ts                    # Mastra config, middleware, exports
├── agents/                     # 23 agents
│   ├── orchestrator/           # Web, Email, iMessage orchestrators
│   ├── gmail/                  # Query + Action agents
│   ├── google-calendar/        # Query + Action agents
│   ├── google-drive/           # Query + Action agents
│   ├── slack/                  # Query + Action agents
│   ├── scheduling/             # Scheduling agent
│   └── ...                     # Other domain agents
├── tools/                      # 30+ tools
│   ├── gmail-tools.ts
│   ├── google-calendar-tools.ts
│   ├── scheduling/             # 8 scheduling tools
│   └── ...
├── workflows/                  # 11 workflows
│   ├── email-triage-workflow.ts
│   ├── daily-brief/
│   ├── hitl/                   # 4 HITL workflows
│   └── inngest/                # Durable scheduling
├── services/                   # 32 services
│   ├── token-billing/          # Billing exporter
│   ├── supabase.ts
│   ├── turso.ts
│   └── ...
├── lib/                        # Utilities
│   ├── token-resolver.ts
│   ├── token-fetcher.ts
│   └── ...
├── routes/                     # Custom API routes
│   └── chat-with-logging.ts
├── processors/                 # Custom processors
│   └── message-deduplicator.ts
└── types/                      # TypeScript types
    └── request-context-types.ts

Key Architectural Decisions

Decision	Rationale
Query/Action Agent Split	Separates read (fast, no approval) from write (HITL protected)
Three-Tier Token Resolution	Reduces DB queries via caching while ensuring reliability
Channel-Specific Memory	Prevents duplicate errors (web) while maintaining context (iMessage)
HITL for All Writes	User always approves mutations before execution
Inngest for Long Workflows	Durability for scheduling workflows that span hours/days
Agent Network Routing	LLM-driven routing is more resilient than instruction-based routing
Observable Token Billing	Every LLM call traced to user for accurate billing

Last updated: January 2025