Wiki AI Learning

Research Infrastructure AI Learning Platform

pattern

6 Agent Design Patterns

Why This Matters

Without shared patterns, agent systems collapse into technical debt fast. Teams routinely prototype 30+ custom agent designs before realizing that almost every production use case fits one of six fundamental architectures. These six patterns are composable: complex systems are built by combining them, not by inventing new ones.

Pattern Decision Guide

Before picking a pattern, answer two questions:

Is execution order fixed or dynamic? Fixed → Sequential or Parallel. Dynamic → Coordinator or Agent-as-Tool.
Does quality need a gate? Yes → Loop & Critique. No → any of the others.

If the task is simple enough for one LLM call, start with Single before adding complexity.

Pattern 1: Single Agent

One LLM. One tool loop. One output.

User Query → Agent (+ tools) → Response

The agent has direct access to all tools it needs and handles the task from start to finish. This is the fastest pattern to build and easiest to debug.

When to use:

The task fits in one context window
Tool usage is limited and predictable
Latency requirements are strict

Watch out for:

Prompt bloat: adding more tools and instructions to one agent degrades performance
Tool-order drift: the agent starts calling tools in the wrong sequence as the prompt grows

Signal to move on: when the system prompt exceeds ~2,000 tokens or tool call errors increase, migrate to Sequential.

Pattern 2: Sequential Agent

A deterministic, ordered pipeline of specialist agents.

Orchestrator → Agent A (Plan) → Agent B (Execute) → Agent C (Review) → Output

Each agent performs one stage and passes structured output to the next. The orchestrator coordinates the pipeline but does not decide at runtime which stage runs — the order is fixed.

When to use:

The task has well-defined, dependent stages
Each stage needs a different system prompt or tool set
Auditability and traceability are required (every stage is logged independently)

Trade-offs:

Pro	Con
Predictable, step-by-step execution	Higher latency (stages run sequentially)
Each stage testable in isolation	No branching — can't skip stages dynamically
Easy to debug (inspect each handoff)	Brittle if a stage produces unexpected output

Amprealize example: The GEP runs 8 phases in sequence — intake, behavior retrieval, context composition, generation, review, and delivery.

Pattern 3: Parallel (Fan-out / Synthesizer)

Independent sub-tasks run concurrently; a synthesizer merges the results.

                ┌── Agent A (task 1) ──┐
Orchestrator ──▶├── Agent B (task 2) ──┤──▶ Synthesizer ──▶ Output
                └── Agent C (task 3) ──┘

The orchestrator fans work out to agents that can run in parallel. The synthesizer waits for all results and merges them.

When to use:

Sub-tasks are genuinely independent (no data dependency between them)
Latency reduction is a priority
The final step requires combining multiple perspectives or data sources

Trade-offs:

Pro	Con
Significant latency reduction	Partial failure handling is complex
Scales well with the number of sub-tasks	Cost scales with parallelism
Natural for research, comparison, and aggregation tasks	Synthesizer prompt can become complex

Partial failure strategy: decide whether to fail fast (abort on any error), fail soft (synthesize with available results), or retry individual branches.

Pattern 4: Coordinator

An LLM-driven dispatch layer dynamically routes tasks to specialist agents.

User Query → Coordinator (LLM decides routing) → Agent A or B or C → Output

Unlike Sequential (fixed order) or Parallel (all run), the Coordinator reads the task at runtime and decides which specialist agent(s) to invoke. The routing decision itself is an LLM call.

When to use:

The task type is not known at design time
A growing catalog of specialist agents handles different domains
You need to add new agent types without changing the routing code

Trade-offs:

Pro	Con
Highly flexible — new agents slot in without refactoring	Routing is non-deterministic (LLM can mis-route)
Scales to large agent catalogs	One extra LLM call per routing decision
Natural fit for user-facing chat systems	Harder to test — you must test routing logic separately

Key implementation detail: each specialist agent's description must be precise. The Coordinator's routing quality is only as good as the descriptions it reads. Vague descriptions → mis-routing.

Pattern 5: Agent-as-Tool

Specialist sub-agents are exposed as callable tools to a primary orchestrator.

Primary Agent ──▶ tool: research_agent() ──▶ Research Sub-Agent
               ──▶ tool: summarizer_agent() ──▶ Summarizer Sub-Agent
               ──▶ tool: validator_agent() ──▶ Validator Sub-Agent

Instead of delegating to an autonomous agent (as in Coordinator), the primary agent calls sub-agents the same way it calls any other tool — with a defined input schema and a structured return value. The primary agent retains full control of synthesis and the final response.

When to use:

The primary agent must maintain ownership of the final output
Sub-agents perform information gathering or transformation, not autonomous decision-making
You want MCP-compatible composition (sub-agents can be exposed as MCP tools)

Trade-offs:

Pro	Con
Primary agent retains full context and control	Sub-agents cannot act autonomously
Sub-agents are testable as pure functions	Sub-agent output feeds back into primary context (token cost)
Clean separation of concerns	Primary agent's context window fills faster with sub-agent results

MCP connection: this pattern pairs naturally with MCP — sub-agents can be wrapped as MCP tools and discovered dynamically, rather than being hardcoded into the primary agent's tool list.

Pattern 6: Loop & Critique

A generator produces output; a critic evaluates it; the loop repeats until a quality gate passes.

                  ┌────────────────────────────────────┐
                  │                                    │
User Query → Generator Agent → Critic Agent → [pass?] ──▶ Output
                                                  │
                                              [fail] ──▶ Generator (with feedback)

The critic has its own system prompt and evaluates the generator's output against explicit criteria. If the output fails, the critic's feedback is added to the generator's context for the next attempt. The loop exits when the critic approves or a maximum iteration count is reached.

When to use:

Output quality is non-negotiable (compliance text, medical content, legal summaries)
Hallucination risk is high and must be caught automatically
The domain has clear, expressible quality criteria

Trade-offs:

Pro	Con
Catches errors that single-pass generation misses	Token cost scales with iterations
Critic feedback improves generator context on each pass	Latency scales with iterations
Self-validating — reduces need for human review	Risk of infinite loops — always set a max iteration cap

Implementation tips:

Keep the critic's criteria explicit and enumerable (not "is this good?" but "does it answer the question? is every claim cited?")
Log every iteration for debugging
Set max_iterations = 3 as a default; raise only with measured evidence

Composing Patterns

Complex production systems combine patterns. Common compositions:

System type	Composition
Research assistant	Coordinator → Parallel (gather) → Synthesizer
Content pipeline	Sequential (draft → edit) with Loop & Critique on the edit stage
Customer support	Coordinator (route to domain) → Agent-as-Tool (retrieve + respond)
Amprealize GEP	Sequential (8 phases) + internal Coordinator (agent routing)

The "seventh pattern test": if you think you need a new pattern, try to express it as a composition of these six first. Production experience suggests you almost always can.

6 Agent Design Patterns

Why This Matters

Pattern Decision Guide

Pattern 1: Single Agent

Pattern 2: Sequential Agent

Pattern 3: Parallel (Fan-out / Synthesizer)

Pattern 4: Coordinator

Pattern 5: Agent-as-Tool

Pattern 6: Loop & Critique

Composing Patterns

See Also

Request early access