How to decompose an agent
Single agent first; escalate only on evidence. A decision procedure for topology, specialist boundaries, tool classification, and cross-cutting concerns.
In Part 1 we established what the orchestration layer is and the five questions it answers every turn. This post is the other half: how do you decide the shape of the system that answers them?
Decomposition is a decision to be documented, not a structure to be assumed. Most teams skip the deciding and jump straight to "let's build a multi-agent system," then spend the next quarter discovering which boundaries they drew wrong. This is the decision procedure that prevents that.
Key takeaways
- Assemble the inputs first: workload, tools, users, constraints, concerns, failure modes. Skipping this is the most common decomposition failure.
- Single agent first. Escalate only when a measurable failure mode fires.
- Router and coordinator are different shapes, not different effort levels. State is the dividing line.
- Cut along permission divergence first; persona and vocabulary alone never justify a split.
- The output is a Decomposition ADR: an artifact you can review, challenge, and revise.
Do the pre-work before you decide
You cannot decompose what you have not inventoried. Before any topology decision, assemble six inputs. This is pre-work, not the decision itself, but skipping it is the single most common way teams get decomposition wrong, because cut lines emerge from this material, not from a whiteboard.
| Input | What to collect | Why it matters |
|---|---|---|
| Workload inventory | Every workflow the agent must support, with a frequency estimate | Cut lines emerge from clustering these, not from the raw list |
| Tool surface | Each tool + permission profile + regulatory weight + blast radius (single vs. bulk) | Permissions drive the strongest split signal |
| User patterns | Who uses it, how often, what their typical interaction shape is | Determines coordinator state burden: pause/resume, compound intent, session length |
| Constraints | Compliance rules, cost limits, latency requirements, audit needs | Determines where cross-cutting concerns are placed |
| Cross-cutting concerns | Rules or behaviors that must apply across multiple workflows | Each one needs a placement |
| Failure modes you care about | Which mistakes are unacceptable; which are recoverable | Shapes where gates go and which concerns get auditor treatment |
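One way to keep the six inputs reviewable is to capture them as plain data before any topology discussion starts. The sketch below is only an assumption about shape, not a prescribed schema: every class and field name (`Workflow`, `ToolProfile`, `DecompositionInputs`, `missing_inputs`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    name: str
    frequency_per_week: int  # a rough estimate is enough for clustering


@dataclass
class ToolProfile:
    name: str
    permission: str         # e.g. "read", "write", "read/write"
    regulatory_weight: str  # e.g. "none", "TCPA", "FCRA"
    bulk_capable: bool      # blast radius: single vs. bulk


@dataclass
class DecompositionInputs:
    workloads: list[Workflow] = field(default_factory=list)
    tools: list[ToolProfile] = field(default_factory=list)
    user_patterns: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    cross_cutting_concerns: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)

    def missing_inputs(self) -> list[str]:
        """Names of inputs that are still empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]


inputs = DecompositionInputs(
    workloads=[Workflow("interview scheduling", 40)],
    tools=[ToolProfile("calendar", "read/write", "none", False)],
)
print(inputs.missing_inputs())
```

A non-empty result from `missing_inputs()` is a signal that the pre-work is unfinished and the topology discussion should wait.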
Single agent first; the decision flow
Start from the heuristic that closed Part 1: single agent first; escalate only on evidence. This is not an aesthetic preference. It is the published recommendation of both OpenAI and Anthropic. OpenAI's A Practical Guide to Building Agents recommends that you "maximize a single agent's capabilities first." Anthropic's Building Effective Agents recommends "finding the simplest solution possible, and only increasing complexity when needed." Move because a measurable failure mode fires, not because multi-agent sounds sophisticated.
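The flow is a single gate followed by a triage: the gate asks whether any measurable limit fires, and only if one does do you triage between the multi-agent shapes below.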
If the flow terminates at single-agent, you are done choosing topology, but section #6 (cross-cutting concerns) and #7 (tool taxonomy) still apply. They are topology-independent. Part 3 walks a full single-agent decision end to end.
The practical topology menu for chat-bound agents is short:
- Single agent. One model holds the instruction set; orchestration is implicit.
- Router + specialists. Workflows are independent; no pause/resume, no cross-workflow sequencing. A stateless dispatcher picks one specialist per turn.
- Coordinator + specialists. Workflows pause and resume, compound intents are common, transition gates span specialists. An explicit, stateful coordinator owns Q1 state and Q2 decomposition.
Two more topologies, peer agents with handoffs and plan-then-execute, exist but live mostly in autonomous and research-agent territory, outside this series' chat-bound scope.
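The gate-then-triage flow can be sketched as a small function. This is one reading of the flow under stated assumptions, not the definitive procedure: `choose_topology`, the `Topology` enum, and the string labels for limits and traits are all hypothetical, and the triage step assumes that any state-bearing trait is enough to require a coordinator.

```python
from enum import Enum


class Topology(Enum):
    SINGLE_AGENT = "single agent"
    ROUTER = "router + specialists"
    COORDINATOR = "coordinator + specialists"


def choose_topology(limits_fired: set[str], state_traits: set[str]) -> Topology:
    """Gate first (did any limit fire?), then triage on state."""
    if not limits_fired:
        # No measurable failure mode: stay on a single agent.
        return Topology.SINGLE_AGENT
    if state_traits:
        # Pause/resume, compound intents, or cross-specialist gates need state.
        return Topology.COORDINATOR
    # Independent workflows with nothing to remember between turns.
    return Topology.ROUTER


print(choose_topology(set(), {"pause/resume"}))
print(choose_topology({"permissions divergence"}, set()))
print(choose_topology({"permissions divergence"}, {"compound intents"}))
```

Note that state traits alone do not trigger a split in this sketch: with no limit fired, the function stays on a single agent regardless of how stateful the workload is.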
Router vs. coordinator: state is the dividing line
This is the most common confusion in agent architecture. Both put a central component in front of specialists. The difference is state, and it is a difference of shape, not of effort.
- Router: stateless. Picks a specialist per turn. No memory of paused workflows, no compound-intent decomposition, no transition gates.
- Coordinator: stateful. Owns the active workflow, the pause/resume stack, pending gates, and intent decomposition.
The difference is visible in the shape. Both put a central component in front of the same specialists, but only the coordinator has a place to keep session state: the pause/resume stack and pending gates that the router has nowhere to put.
Router:
- Router → Specialist A
- Router → Specialist B
- Router → Specialist C

Coordinator:
- Coordinator → Session state
- Coordinator → Specialist A
- Coordinator → Specialist B
- Coordinator → Specialist C
Walk a few workload traits and watch which one you actually need:
- Multi-step workflows that can be interrupted
- Compound utterances
- Preconditions spanning specialists
If your workload has interruptible workflows, compound utterances, or cross-specialist gates, you need a coordinator. Picking a router because it "sounds simpler" is a category mistake that resurfaces later as state you have to bolt on under pressure.
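The difference in shape shows up directly in code. The following is a minimal sketch with hypothetical names (`Router`, `Coordinator`, `start`, `finish`), and it assumes specialists are plain callables. The router only dispatches; the coordinator adds the active workflow and a pause/resume stack on top of the same dispatch.

```python
from typing import Callable, Optional

Specialist = Callable[[str], str]


class Router:
    """Stateless: every turn is dispatched from scratch."""

    def __init__(self, specialists: dict[str, Specialist]) -> None:
        self.specialists = specialists

    def handle(self, intent: str, message: str) -> str:
        return self.specialists[intent](message)


class Coordinator(Router):
    """Stateful: same dispatch, plus a place to keep session state."""

    def __init__(self, specialists: dict[str, Specialist]) -> None:
        super().__init__(specialists)
        self.active: Optional[str] = None
        self.paused: list[str] = []  # pause/resume stack

    def start(self, workflow: str) -> None:
        # A pivot pauses whatever was running instead of dropping it.
        if self.active is not None:
            self.paused.append(self.active)
        self.active = workflow

    def finish(self) -> Optional[str]:
        """End the active workflow and resume the most recently paused one."""
        self.active = self.paused.pop() if self.paused else None
        return self.active


specialists = {
    "sourcing": lambda msg: f"sourcing handled: {msg}",
    "scheduling": lambda msg: f"scheduling handled: {msg}",
}
coordinator = Coordinator(specialists)
coordinator.start("sourcing")
coordinator.start("scheduling")  # mid-conversation pivot
resumed = coordinator.finish()   # scheduling done, sourcing resumes
```

A `Router` instance has no `paused` attribute at all, which is the point: bolting one on later means turning it into a coordinator under pressure.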
What "specialist" actually means
The methodology uses "specialist" throughout, and it is worth pinning down, because the word names a role, not an implementation. A specialist is a component that reasons independently about a coherent cluster of work. It is the answer to Q3 ("who acts next?") when the right answer is a domain-owning component rather than a plain tool or the user.
The same conceptual specialist can be built many ways:
- Sub-agent. The most common form in coordinator-based topologies: a scoped LLM call with its own system prompt and a subset of tools.
- LLM-backed tool ("agent as tool"). A specialist exposed to the coordinator as a callable. This is the OpenAI Agents SDK as_tool() pattern, where an agent is explicitly registered as a named tool. Anthropic's subagents are also invoked through a tool, but the framing differs: Anthropic emphasizes context isolation and delegation (each subagent runs in a fresh context and returns only its final message), where OpenAI emphasizes explicit registration. Same mechanism, different design center.
- Graph node. A node in a state graph (LangGraph-style), either LLM-backed or deterministic.
- Prompt section with scoped tools (single-agent topologies only). In single-agent systems, "specialists" exist only as conceptual sections of one system prompt with role-specific tool access, not as separate components.
The distinction between a specialist and a tool is reasoning. A specialist decides how to do its domain work given user intent and available tools. A tool is a deterministic interface: it takes input and returns a result, even if it is internally LLM-backed. A specialist whose only job is to call one tool with no real reasoning is the tool-equivalent specialist anti-pattern (see #9); collapse it back into a coordinator-called tool.
Why this matters: boundary decisions are upstream of implementation. The same conceptual boundaries should hold whether you build on Pydantic AI, LangGraph, or the Claude Agent SDK. Decide the boundaries framework-agnostically; pick the primitive (sub-agent vs. graph node vs. prompt section) afterward.
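To make "role, not implementation" concrete, here is a framework-agnostic sketch in which one boundary is satisfied by two different primitives. Nothing here is a real SDK API: `Specialist`, `SubAgentSpecialist`, `GraphNodeSpecialist`, `delegate`, and `fake_model` are hypothetical names, and the model call is a stand-in function injected by the caller.

```python
from typing import Callable, Protocol


class Specialist(Protocol):
    """The role: handles requests for one coherent cluster of work."""

    name: str

    def handle(self, request: str) -> str: ...


class SubAgentSpecialist:
    """Sub-agent form: a scoped model call with its own prompt and tools."""

    def __init__(self, name: str, system_prompt: str, tools: list[str],
                 call_model: Callable[[str, str], str]) -> None:
        self.name = name
        self.system_prompt = system_prompt
        self.tools = tools
        self.call_model = call_model  # injected, so no vendor SDK is assumed

    def handle(self, request: str) -> str:
        return self.call_model(self.system_prompt, request)


class GraphNodeSpecialist:
    """Graph-node form: here a deterministic step over the request."""

    def __init__(self, name: str, step: Callable[[str], str]) -> None:
        self.name = name
        self.step = step

    def handle(self, request: str) -> str:
        return self.step(request)


def delegate(specialist: Specialist, request: str) -> str:
    """The caller depends on the boundary, not on how it is built."""
    return f"[{specialist.name}] {specialist.handle(request)}"


def fake_model(system_prompt: str, request: str) -> str:
    # Stand-in for a real LLM call; it only echoes its inputs.
    return f"({system_prompt}) {request}"


sourcing = SubAgentSpecialist("sourcing", "You handle sourcing.", ["job_board"], fake_model)
requisition = GraphNodeSpecialist("requisition", lambda r: r.upper())
print(delegate(sourcing, "post role"))
print(delegate(requisition, "open req"))
```

Swapping one implementation for the other changes nothing in `delegate`, which is the sense in which the boundary decision sits upstream of the framework choice.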
Where to cut: the five axes
You do not cut along the raw workflow list. You cut along these axes, in roughly this order of strength.
| Axis | Strength | Cut when |
|---|---|---|
| 1 | Strongest | Tools have genuinely different permissions or regulatory weight |
| 2 | Strong | Tools have different shapes (real-time vs. async, transactional vs. lifecycle) |
| 3 | Moderate | Long-running stateful work mixes with single-call transactional work |
| 4 | Moderate | A class of decisions must be inspectable and replayable for compliance |
| 5 | Weakest alone | Genuinely different domain vocabularies, but only alongside a stronger axis |
Two things that do not justify a cut: persona or tone alone (prompt templates with policy variables handle that; persona only counts when paired with axis 1), and one-specialist-per-workflow (specialists are clusters of coherent work, not 1:1 mirrors of the inventory).
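The axis rules can be read as a comparison between two candidate clusters of work. The sketch below is an illustration under assumptions, not a scoring formula from the methodology: `Cluster`, `cut_axes`, and `should_split` are hypothetical names, the axes are numbered in the table's order of strength, and "divergence" is simplified to inequality of a single field per axis.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    name: str
    permissions: frozenset[str]  # permission / regulatory profile
    tool_shape: str              # e.g. "transactional", "lifecycle"
    long_running: bool
    needs_audit_replay: bool
    vocabulary: str
    persona: str                 # deliberately ignored by cut_axes


def cut_axes(a: Cluster, b: Cluster) -> list[int]:
    """Axis numbers (1 = strongest) on which two clusters diverge."""
    axes: list[int] = []
    if a.permissions != b.permissions:
        axes.append(1)
    if a.tool_shape != b.tool_shape:
        axes.append(2)
    if a.long_running != b.long_running:
        axes.append(3)
    if a.needs_audit_replay != b.needs_audit_replay:
        axes.append(4)
    if a.vocabulary != b.vocabulary and axes:
        axes.append(5)  # vocabulary counts only alongside a stronger axis
    return axes


def should_split(a: Cluster, b: Cluster) -> bool:
    return bool(cut_axes(a, b))


sourcing = Cluster("sourcing", frozenset({"sms:write"}), "transactional",
                   False, False, "acquisition", "friendly")
hiring_gate = Cluster("hiring gate", frozenset({"background_check:write"}), "lifecycle",
                      True, True, "compliance", "formal")
# Same tools and shape as sourcing; only tone and vocabulary differ.
sourcing_casual = replace(sourcing, name="sourcing (casual)",
                          persona="casual", vocabulary="informal")

print(cut_axes(sourcing, hiring_gate))
print(should_split(sourcing, sourcing_casual))
```

Persona never appears in `cut_axes`, and vocabulary is appended only when a stronger axis has already fired, so a persona-only or vocabulary-only difference cannot produce a split.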
Placing cross-cutting concerns
Some concerns do not belong to any single workflow: compliance rules, rate limits, cost controls, audit logging, content moderation, security policies, consent. Every non-trivial agent has them, and each needs a deliberate placement. There are four patterns, and they form a migration path from cheap to expensive.
For each concern, ask in order:
- (1) Is an external system already handling this? → external encapsulation.
- (2) Does it apply to one specialist only? → embedded (specialist's system prompt).
- (3) Does it need deterministic enforcement or cross-specialist visibility? → coordinator policy (enforcement) or auditor (correlation / immutable log).
- (4) Default for v1 → embedded, with the extraction triggers documented.
The migration path matters more than picking the most sophisticated pattern upfront. A concern typically starts embedded, moves to coordinator policy when the same rule shows up in multiple prompts or must change faster than redeploys, and is extracted to an auditor when an immutable trail or cross-specialist correlation becomes a hard requirement. Document the trigger conditions; that is more valuable than over-building on day one.
In single-agent systems the four patterns still apply. Embedded and coordinator-policy simply collapse into "in the system prompt," but the choice between prompt placement, auditor, and external encapsulation is real. Part 3 shows these three appearing in one single-agent system.
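The ordered questions translate naturally into a small placement function. This is a sketch under assumptions: `Concern`, `Placement`, and `place` are hypothetical names, each question is reduced to one boolean or count, and the tie-break between auditor and coordinator policy (auditor first when an immutable log is required) is an arbitrary choice made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    EXTERNAL = "external encapsulation"
    EMBEDDED = "embedded"
    COORDINATOR_POLICY = "coordinator policy"
    AUDITOR = "auditor"


@dataclass
class Concern:
    name: str
    handled_externally: bool = False
    specialists_affected: int = 1
    needs_deterministic_enforcement: bool = False
    needs_immutable_log: bool = False  # or cross-specialist correlation


def place(concern: Concern) -> Placement:
    """Ask the placement questions in order and stop at the first match."""
    if concern.handled_externally:            # question 1
        return Placement.EXTERNAL
    if concern.specialists_affected == 1:     # question 2
        return Placement.EMBEDDED
    if concern.needs_immutable_log:           # question 3, auditor branch
        return Placement.AUDITOR
    if concern.needs_deterministic_enforcement:  # question 3, policy branch
        return Placement.COORDINATOR_POLICY
    return Placement.EMBEDDED                 # question 4: the v1 default


print(place(Concern("SMS consent", specialists_affected=1)))
print(place(Concern("transition gates", specialists_affected=3,
                    needs_deterministic_enforcement=True)))
```

The function records only the current placement. The migration triggers (the same rule appearing in several prompts, an audit trail becoming mandatory) still need to be written down next to it.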
Tool taxonomy: Type 1 vs. Type 2
Not all tools are the same shape, and the difference drives where state belongs. Recognize the two classes early: most consequential external state in production agents lives behind the second one.
- Type 1. Single call, immediate result, no external lifecycle. Send-SMS, query-database, get-current-time. The agent gets a result and moves on; nothing keeps living after the call returns.
- Type 2. A single call returns a reference to long-lived external state. The external system owns the lifecycle, state, and operational complexity; the agent sees a clean tool surface. A job-posting platform managing many ad networks; a background-check vendor running a multi-day verification; a calendar with event lifecycles.
Type 2 tools have three consequences for decomposition.
- Specialists should call them but not replicate their lifecycle logic; a specialist accumulating state that mirrors an external system is a sign the tool should be Type 2, not that the specialist needs more state.
- When action on one Type 2 resource implies action on another (close parent X → close child resources Y on a different platform), that orchestration belongs at the coordinator, not in either specialist.
- Type 2 tools fail in state-dependent ways (platform down, resource expired, budget exceeded mid-run), and specialists must surface those failures gracefully without owning the underlying lifecycle.
Tool taxonomy is topology-independent: it applies whether you build single-agent or multi-agent.
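The two tool classes and the cross-resource consequence can be sketched together. Everything here is illustrative and assumed: `JobBoardPlatform` is an in-memory stand-in for an external system, and `get_current_time`, `create_posting`, `close_posting`, and `close_requisition` are hypothetical names rather than any vendor's API.

```python
import itertools


def get_current_time() -> str:
    """Type 1: one call, immediate result, nothing lives on afterward."""
    return "2025-01-01T00:00:00Z"  # fixed value keeps the sketch deterministic


class JobBoardPlatform:
    """Type 2: the platform, not the agent, owns the posting lifecycle."""

    def __init__(self) -> None:
        self._status: dict[str, str] = {}
        self._ids = itertools.count(1)

    def create_posting(self, requisition_id: str) -> str:
        ref = f"posting-{next(self._ids)}"
        self._status[ref] = "live"
        return ref  # the agent keeps only this reference

    def close_posting(self, ref: str) -> None:
        self._status[ref] = "closed"

    def status(self, ref: str) -> str:
        return self._status[ref]


def close_requisition(requisition_id: str,
                      posting_refs: dict[str, list[str]],
                      platform: JobBoardPlatform) -> list[str]:
    """Coordinator-level consequence: closing a requisition closes its postings."""
    closed = []
    for ref in posting_refs.get(requisition_id, []):
        platform.close_posting(ref)
        closed.append(ref)
    return closed


platform = JobBoardPlatform()
refs = {"req-42": [platform.create_posting("req-42"), platform.create_posting("req-42")]}
closed = close_requisition("req-42", refs, platform)
```

The agent side holds references and asks the platform for status. It never keeps its own copy of "live" or "closed", which is the hidden-state pattern the first consequence warns against.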
A worked example: when the evidence forces a split
A methodology is easiest to trust when you watch it run. Here it is on a domain where the evidence genuinely forces a split: an ATS recruiting agent, the kind of system a recruiter uses to hire software engineers and other knowledge workers. Part 3 runs the same procedure on a domain where it does not split. The contrast is the point.
Start with the pre-work from #1.
Workload inventory: ~15 workflows
- Requisition management: open, draft, edit job reqs
- Sourcing: advertise across job boards, SMS reactivation of past applicants, referrals
- Screening: review applications, knockout questions, ranking
- Pipeline management: advance or reject candidates through stages, single and bulk
- Interview scheduling
- Background checks
- Offer letters
- Pipeline analytics
The vocabularies differ sharply across clusters: acquisition, interview logistics, legal compliance.
Tool surface: 6 tools, 3 of them Type 2
- ATS database: R/W, candidate PII, bulk-write with high blast radius
- Job-board / advertising platform: write, cost-controlled, externally managed (Type 2)
- SMS provider: write, TCPA-regulated
- Calendar: R/W, event lifecycles (Type 2)
- Background-check vendor: write, FCRA-regulated, legal weight (Type 2)
- Analytics service: read-only
User patterns: recruiter, multi-step, bulk
- A recruiter managing many open roles at once
- Long sessions, multi-step workflows
- Frequent bulk actions
- Compound utterances ("advance these three to onsite, schedule them with the platform team, but skip anyone already in another loop")
- Mid-conversation pivots are common
Constraints: FCRA, TCPA, EEOC, cost, PII
- FCRA: background-check disclosure and adverse-action procedure
- TCPA: SMS consent and opt-out
- EEOC and anti-discrimination law: disposition decisions must be defensible and auditable
- Cost: advertising-spend controls
- PII: candidate data privacy
Now evaluate the same five limits from Part 1 that the single-agent flow checks. This time the evidence is present.
| Limit | Fires? | Evidence |
|---|---|---|
| Permissions / policy divergence | Yes, strongly | FCRA background checks and offer letters (write, legal consequence), TCPA-regulated SMS, bulk ATS mutation, and read-only analytics are genuinely different permission and regulatory surfaces. This one justifies the split on its own. |
| Instruction-set bloat | Yes | 15+ workflows with sharply different vocabularies (sourcing channels, interview logistics, compliance language) strain a single coherent prompt. |
| Context saturation | Moderate | A popular role draws a large candidate pool; reasoning over many candidate records for a bulk action loads heavy context. |
| Parallel work | Yes | Posting one role to several boards at once, checking pipeline status across many open roles, and running background checks on several finalists are genuinely parallel. |
| Trace legibility | Needed | Bulk disposition affecting many candidates needs explicit, auditable routing, especially under anti-discrimination scrutiny. |
Multiple limits fire, and permissions divergence alone is decisive: a read-only analytics query and an FCRA-regulated background check cannot safely share an agent regardless of prompt quality. The framework escalates off single-agent.
Topology: coordinator, not router
The split is real, so the next question is the shape. State is the dividing line from earlier in this post, and three traits of this workload each demand it:
- Pause and resume. A recruiter starts a bulk advance, pauses to inspect one candidate, then resumes. That needs a pause/resume stack.
- Compound utterances. "Advance these, schedule them, skip conflicts" is one message with several ordered sub-intents, which is Q2 decomposition.
- Transition gates across specialists. No offer goes out before screening is complete and the background check has cleared, a precondition that spans specialists.
All three are coordinator territory. A flat router would lose the workflow state the moment the recruiter pivoted.
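The transition gate is the easiest of the three traits to show in code. The sketch below assumes the coordinator tracks two facts per candidate; `CandidateState` and `offer_gate` are hypothetical names, and in a real system these flags would be fed by the specialists that own screening and background checks.

```python
from dataclasses import dataclass


@dataclass
class CandidateState:
    screening_complete: bool = False
    background_check_cleared: bool = False


def offer_gate(state: CandidateState) -> list[str]:
    """Return the unmet preconditions; an empty list means the gate is open."""
    unmet = []
    if not state.screening_complete:
        unmet.append("screening not complete")
    if not state.background_check_cleared:
        unmet.append("background check not cleared")
    return unmet


blocked = offer_gate(CandidateState(screening_complete=True))
ready = offer_gate(CandidateState(screening_complete=True, background_check_cleared=True))
print(blocked)
print(ready)
```

Because the two flags come from different specialists, the check has to live somewhere that can see both, which is why it sits with the coordinator rather than inside either specialist.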
Specialist boundaries
The cut lines come from the axes in #5, permissions first.
| Specialist | Owns | Why it is its own boundary |
|---|---|---|
| Coordinator | Q1 state, Q2 decomposition, bulk-action confirmations, transition-gate policy, cross-tool consequences | Orchestration, not domain work. The stateful burden lives here. |
| Hiring Gate | Background checks, offer letters | Axis 1, the strongest: FCRA and legal weight isolate this surface. Axis 4: it is the cleanest audit boundary. |
| Sourcing | Job advertising, SMS reactivation, referrals | Axis 1: the TCPA-regulated SMS surface is distinct. Axis 2: acquisition is one coherent tool surface. |
| Pipeline & Scheduling | Disposition (single and bulk), interview scheduling | Axis 2: advancing a candidate and booking the interview are one tightly coupled tool surface. |
| Requisition | Create, draft, edit, look up job reqs | Axis 2: a coherent tool surface, the ATS requisition tables. |
| Analytics | Reporting, pipeline metrics | Borderline. It is read-only, so lean toward a coordinator-called tool unless it must reason about which metric answers the question; otherwise it is the tool-equivalent-specialist anti-pattern. |
The shape that falls out is one coordinator owning state and orchestration, over five specialists clustered by the axes above, permissions first.
- Coordinator → Hiring Gate
- Coordinator → Sourcing
- Coordinator → Pipeline & Scheduling
- Coordinator → Requisition
- Coordinator → Analytics
Concerns and tools
Cross-cutting concerns land across three of the four placement patterns:
- Embedded: FCRA in the Hiring Gate prompt, TCPA in Sourcing. Both are specialist-specific and stable.
- Coordinator-owned: bulk-action confirmations and transition gates. They cross specialists and need deterministic enforcement.
- Auditor, deferred: EEOC adverse-action correlation. Stand it up only when an immutable, cross-specialist audit trail becomes a hard requirement, not before.
On tools, the advertising platform, the calendar, and the background-check vendor are Type 2: each owns long-lived external state behind a clean tool surface. That is exactly why "close a requisition, then take down all of its job postings" belongs at the coordinator. It is a consequence that spans two platforms, the ATS and the Type 2 advertising platform, and no single specialist should own it.
Write this up and you already have most of a Decomposition ADR. The next section gives it a shape.
The output: a Decomposition ADR
The methodology produces an artifact, not a vibe: a Decomposition ADR you can review, challenge, and revise as the system evolves. The point is not to be right on day one; it is to make the decisions explicit so future revisions can challenge them. A defensible ADR contains eleven things:
1. Workload inventory. Every workflow with frequency and criticality.
2. Tool surface. Each tool with permission profile, regulatory weight, blast radius, and Type 1 / Type 2 classification.
3. User-pattern summary. Interaction shape, session length, compound-intent likelihood, pause/resume burden.
4. Constraints. Compliance, cost, latency, audit.
5. Limit evaluation. Which of the Part 1 limits fire, with evidence.
6. Topology choice. Single agent / router / coordinator+specialists / other, with alternatives considered and rejected.
7. Specialist boundaries. Each specialist with owned workflows, tools, and the axes that justify the boundary.
8. Coordinator responsibilities. If applicable: state model, gates, sequencing rules.
9. Cross-cutting concern placement. Each concern with placement and rationale.
10. Migration triggers. Conditions under which placements should be revisited.
11. Open questions. Design judgments that need empirical validation.
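If the ADR is kept as structured text, a completeness check is cheap to automate. The following is a minimal sketch that assumes the ADR is stored as a dictionary of section name to prose; the section keys, `REQUIRED_SECTIONS`, and `missing_sections` are hypothetical, and the only rule encoded is that coordinator responsibilities are required only for a coordinator topology.

```python
REQUIRED_SECTIONS = (
    "workload_inventory",
    "tool_surface",
    "user_pattern_summary",
    "constraints",
    "limit_evaluation",
    "topology_choice",
    "specialist_boundaries",
    "coordinator_responsibilities",
    "cross_cutting_concern_placement",
    "migration_triggers",
    "open_questions",
)


def missing_sections(adr: dict[str, str], topology: str) -> list[str]:
    """List ADR sections that are absent or blank for the chosen topology."""
    # Coordinator responsibilities apply only when there is a coordinator.
    optional = set() if topology == "coordinator" else {"coordinator_responsibilities"}
    return [
        section
        for section in REQUIRED_SECTIONS
        if section not in optional and not adr.get(section, "").strip()
    ]


draft = {section: "TBD by review" for section in REQUIRED_SECTIONS}
draft["migration_triggers"] = "   "       # blank counts as missing
del draft["coordinator_responsibilities"]

print(missing_sections(draft, "coordinator"))
print(missing_sections(draft, "single agent"))
```

A check like this only confirms that each section exists. Whether the evidence in it is convincing is still a matter for human review.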
And the failure modes the ADR is meant to prevent, the anti-patterns that show up again and again:
| Anti-pattern | What it looks like | Why it is wrong |
|---|---|---|
| Splitting on persona alone | Two specialists, identical tools, different tone | Persona alone never justifies a split |
| Specialist proliferation | One specialist per workflow in the inventory | Specialists are clusters, not mirrors |
| Coordinator as god object | Business logic and content generation inside the coordinator | The coordinator owns orchestration, not domain work |
| Hidden state in specialists | A specialist mirrors an external system as a state machine | That state belongs in a Type 2 tool |
| Flat router for stateful workload | Router chosen "because simpler," then state bolted on | Router and coordinator are different shapes, not effort levels |
| Skipping the input phase | Jumping to topology before inventory and constraints | Pre-work is non-optional |
The decision in one breath
- Inputs before topology: workload, tools, users, constraints, concerns, failures.
- Single agent until a named limit fires; permissions divergence is the strongest signal.
- Coordinator when state crosses turns or specialists; router only when it genuinely does not.
- Specialists are clusters that reason; tools are interfaces. Type 2 tools own external lifecycle, not specialists.
- Write the ADR. Document migration triggers. Revise on evidence.
Next: Part 3 — When a single agent wins, where we run this whole procedure on a consumer ecommerce support agent and watch it land, correctly, on a single agent.
References
- OpenAI, A Practical Guide to Building Agents
- Anthropic, Building Effective Agents
- OpenAI Agents SDK, Agents as tools
- Claude Agent SDK, Subagents