How to decompose an agent

In Part 1 we established what the orchestration layer is and the five questions it answers every turn. This post is the other half: how do you decide the shape of the system that answers them?

Decomposition is a decision to be documented, not a structure to be assumed. Most teams skip the deciding and jump straight to "let's build a multi-agent system," then spend the next quarter discovering which boundaries they drew wrong. This is the decision procedure that prevents that.

Key takeaways

Assemble the inputs first: workload, tools, users, constraints, concerns, failure modes. Skipping this is the most common decomposition failure.
Single agent first. Escalate only when a measurable failure mode fires.
Router and coordinator are different shapes: a router dispatches, while a coordinator owns cross-workflow state, decomposition, sequencing, and gates.
Cut along capability and policy divergence first; enforce those boundaries in tools and APIs, not prompts.
The output is a Decomposition ADR: an artifact you can review, challenge, and revise.

01 PRE-WORK

Do the pre-work before you decide

You cannot decompose what you have not inventoried. Before any topology decision, assemble six inputs. This is pre-work, not the decision itself, but skipping it is the single most common way teams get decomposition wrong, because cut lines emerge from this material, not from a whiteboard.

Input	What to collect	Why it matters
Workload inventory	Every workflow the agent must support, with a frequency estimate	Cut lines emerge from clustering these, not from the raw list
Tool surface	Each tool + permission profile + regulatory weight + blast radius (single vs. bulk)	Permissions drive the strongest split signal
User patterns	Who uses it, how often, what their typical interaction shape is	Determines coordinator state burden: pause/resume, compound intent, session length
Constraints	Compliance rules, cost limits, latency requirements, audit needs	Determines where cross-cutting concerns are placed
Cross-cutting concerns	Rules or behaviors that must apply across multiple workflows	Each one needs a placement
Failure modes you care about	Which mistakes are unacceptable; which are recoverable	Shapes where gates go and which concerns get auditor treatment
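One way to make the tool-surface row concrete is to record it as data, so that permission and regulatory columns can be grouped before anyone draws a boundary. The following is a minimal sketch only: the `ToolProfile` field names, the `group_by_surface` helper, and the three example entries are illustrative assumptions, not a schema from this series.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolProfile:
    """One row of the tool-surface inventory (hypothetical field names)."""
    name: str
    permission: str            # e.g. "read", "write", "read-write"
    regulation: Optional[str]  # regulatory weight, if any
    bulk: bool                 # True when one call can touch many records


def group_by_surface(tools):
    """Group tool names that share a permission profile and regulatory weight."""
    groups = defaultdict(list)
    for tool in tools:
        groups[(tool.permission, tool.regulation)].append(tool.name)
    return dict(groups)


# Illustrative entries only; a real inventory comes from the pre-work.
inventory = [
    ToolProfile("analytics", "read", None, bulk=False),
    ToolProfile("reporting", "read", None, bulk=False),
    ToolProfile("sms", "write", "TCPA", bulk=True),
]
surfaces = group_by_surface(inventory)
```

Tools that land in different groups are candidates for different boundaries; tools that share a group are not split by permissions alone.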

02 THE DECISION FLOW

Single agent first; the decision flow

Start from the heuristic that closed Part 1: single agent first; escalate only on evidence. This is not an aesthetic preference. It matches published guidance from both OpenAI and Anthropic. OpenAI's A Practical Guide to Building Agents says to "maximize a single agent's capabilities first." Anthropic's Building Effective Agents recommends "finding the simplest solution possible, and only increasing complexity when needed." Move because a measurable failure mode fires, not because multi-agent sounds sophisticated.

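The flow is a single gate followed by a triage. The gate asks whether any measurable single-agent limit fires; if none does, stay single-agent. If one does, the triage asks whether a component must own cross-workflow state, decomposition, sequencing, or gates: if not, a router suffices; if so, you need a coordinator.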

If the flow terminates at single-agent, you are done choosing topology, but sections #6 (cross-cutting concerns) and #7 (tool taxonomy) still apply. They are topology-independent. Part 3 walks a full single-agent decision end to end.

The practical topology menu for chat-bound agents is short:

Single agent. One model holds the instruction set; orchestration is implicit.
Router + specialists. Workflows are independent; a lightweight dispatcher picks one specialist per turn but does not own cross-workflow state or sequencing.
Coordinator + specialists. Workflows pause and resume, compound intents are common, transition gates span specialists. An explicit, stateful coordinator owns Q1 state, Q2 decomposition, sequencing, and gates.

Two more topologies, peer agents with handoffs and plan-then-execute, exist but live mostly in autonomous and research-agent territory, outside this series' chat-bound scope.
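The gate and triage can be sketched as a small function over the evidence gathered in pre-work. This is an illustration under stated assumptions, not the series' implementation: the `Workload` fields and the returned strings are hypothetical names chosen to mirror the three-item menu above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Evidence gathered in pre-work (hypothetical field names)."""
    limits_fired: List[str]        # named single-agent limits, each with evidence
    interruptible_workflows: bool  # pause/resume is needed
    compound_utterances: bool      # one message, several ordered sub-intents
    cross_specialist_gates: bool   # preconditions spanning specialists


def choose_topology(w: Workload) -> str:
    # Gate: stay single-agent unless a measurable limit fires.
    if not w.limits_fired:
        return "single agent"
    # Triage: does one component need to own cross-workflow state?
    needs_owner = (
        w.interruptible_workflows
        or w.compound_utterances
        or w.cross_specialist_gates
    )
    return "coordinator + specialists" if needs_owner else "router + specialists"
```

Note that the gate comes first on purpose: a workload with interruptible workflows but no fired limit still terminates at single agent.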

03 ROUTER VS COORDINATOR

Router vs. coordinator: state ownership is the dividing line

This is the most common confusion in agent architecture. Both put a central component in front of specialists, and both may read session context. The difference is whether the component merely dispatches or owns and evolves cross-workflow state. It is not a difference of effort; it is a difference of shape.

Router: dispatches a turn to a specialist. It may inspect routing context, but it does not own paused workflows, intent decomposition, or transition gates.
Coordinator: owns the active workflow, the pause/resume stack, pending gates, intent decomposition, and sequencing across specialists.

The difference is visible in the shape. Both put a central component in front of the same specialists. Only the coordinator is responsible for mutating the cross-workflow state that keeps the system coherent across turns.

Router: a lightweight dispatcher. It picks one specialist per turn but does not own cross-workflow state.

Coordinator: same central position over the same specialists, but it owns and evolves cross-workflow state.

Walk a few workload traits and watch which one you actually need:

Router or coordinator?

Multi-step workflows that can be interrupted
Compound utterances
Preconditions spanning specialists

If none of these cross-workflow ownership needs fire, a lightweight router is sufficient. Do not over-engineer.

If your workload has interruptible workflows, compound utterances, or cross-specialist gates, you need a coordinator to own that work. Calling the component a router does not remove the responsibility; it only hides coordinator behavior under a narrower name.
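The difference in shape shows up in what each component remembers. A minimal sketch, assuming a keyword-based `route` function and a `Coordinator` class with hypothetical names: the router returns a destination and keeps nothing, while the coordinator mutates an active workflow and a pause/resume stack.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def route(utterance: str, keywords: Dict[str, str], default: str) -> str:
    """Router: pick one specialist for this turn; nothing is remembered."""
    text = utterance.lower()
    for keyword, specialist in keywords.items():
        if keyword in text:
            return specialist
    return default


@dataclass
class Coordinator:
    """Coordinator: owns the active workflow and a pause/resume stack."""
    active: Optional[str] = None
    paused: List[str] = field(default_factory=list)

    def start(self, workflow: str) -> None:
        # Pivoting mid-workflow pauses the current one instead of dropping it.
        if self.active is not None:
            self.paused.append(self.active)
        self.active = workflow

    def finish(self) -> None:
        # Completing a workflow resumes the most recently paused one, if any.
        self.active = self.paused.pop() if self.paused else None
```

If you find yourself adding a `paused` list to something called a router, you are building a coordinator.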

04 WHAT IS A SPECIALIST

What "specialist" actually means

The methodology uses "specialist" throughout, and it is worth pinning down, because the word names a role, not an implementation. A specialist is a component that reasons independently about a coherent cluster of work. It is the answer to Q3 ("who acts next?") when the right answer is a domain-owning component rather than a plain tool or the user.

The same conceptual specialist can be built many ways:

Sub-agent. The most common form in coordinator-based topologies: a scoped LLM call with its own system prompt and a subset of tools.
LLM-backed tool ("agent as tool"). A specialist exposed to the coordinator as a callable. This is OpenAI's Agents SDK as_tool() pattern, where an agent is explicitly registered as a named tool. Anthropic's subagents are also invoked through a tool, but the framing differs: Anthropic emphasizes context isolation and delegation (each subagent runs in a fresh context and returns only its final message), where OpenAI emphasizes explicit registration. Same mechanism, different design center.
Graph node. A node in a state graph (LangGraph-style), either LLM-backed or deterministic.
Prompt section with scoped tools. Single-agent topologies only: "specialists" exist only as conceptual sections of one system prompt with role-specific tool access, not as separate components.

The distinction between a specialist and a tool is ownership of the task. A specialist decides how to pursue a bounded user goal with its available tools. A tool exposes a callable contract: it takes structured input and returns a result, even if its implementation is internally LLM-backed. A specialist whose only job is to call one tool with no real reasoning is the tool-equivalent specialist anti-pattern (see #9); collapse it back into a coordinator-called tool.

Why this matters: boundary decisions are upstream of implementation. The same conceptual boundaries should hold whether you build on Pydantic AI, LangGraph, or the Claude Agent SDK. Decide the boundaries framework-agnostically; pick the primitive (sub-agent vs. graph node vs. prompt section) afterward.

05 WHERE TO CUT

Where to cut: the five axes

You do not cut along the raw workflow list. You cut along these axes, in roughly this order of strength.

Axis	Strength	Cut when
1 · Permissions and policy	Strongest	Tools need different capabilities, approvals, or regulatory handling
2 · Tool surface	Strong	Tools have different shapes (real-time vs. async, transactional vs. lifecycle)
3 · Stateful vs. transactional work	Moderate	Long-running stateful work mixes with single-call transactional work
4 · Audit boundary	Moderate	A class of decisions must be inspectable and replayable for compliance
5 · Domain vocabulary	Weakest alone	Genuinely different domain vocabularies, but only alongside a stronger axis

Two things that do not justify a cut: persona or tone alone (prompt templates with policy variables handle that; persona only counts when paired with axis 1), and one-specialist-per-workflow (specialists are clusters of coherent work, not 1:1 mirrors of the inventory).
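The ordering and the "weakest alone" rule can be written down as a check you run against each proposed boundary. This is a sketch only; the axis labels in `AXES` are assumed shorthand for the five rows above, and `justifies_cut` and `strongest` are hypothetical helper names.

```python
from typing import Optional, Set

# Assumed shorthand labels for the five axes, strongest first.
AXES = ["permissions", "tool_surface", "state_lifecycle", "audit", "vocabulary"]
WEAK_ALONE = {"vocabulary"}


def justifies_cut(reasons: Set[str]) -> bool:
    """A boundary needs at least one axis that is strong enough on its own."""
    # Reasons that are not axes at all (persona, tone) are ignored.
    axes_fired = reasons & set(AXES)
    return bool(axes_fired - WEAK_ALONE)


def strongest(reasons: Set[str]) -> Optional[str]:
    """Return the highest-ranked axis that fired, for the boundary rationale."""
    for axis in AXES:
        if axis in reasons:
            return axis
    return None
```

Persona paired with a permissions difference passes because permissions passes; persona alone never does.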

06 CROSS-CUTTING CONCERNS

Placing cross-cutting concerns

Some concerns do not belong to any single workflow: compliance rules, rate limits, cost controls, audit logging, content moderation, security policies, consent. Every non-trivial agent has them, and each needs a deliberate placement. Four patterns cover that placement decision.

Placement answers where guidance, configuration, and checks live. It does not replace enforcement. Any rule protecting identity, money, regulated actions, or irreversible side effects needs a deterministic control at the tool, API, policy middleware, or approval boundary, even when the same rule is explained in a prompt.

For each concern, ask in order:

(1) Is an external system already handling this? → external encapsulation.
(2) Is it behavioral guidance for one specialist? → embedded in that specialist's prompt.
(3) Does it control access, money, regulated actions, or irreversible effects? → deterministic policy at the harness, tool, API, or approval boundary.
(4) Does it need cross-specialist correlation or an immutable trail? → auditor.

The migration path matters more than picking the most sophisticated pattern upfront. Advisory behavior can start embedded and move to centralized policy when it spans specialists or changes faster than prompts. Hard constraints start at a deterministic boundary; they do not wait for a later migration. Add an auditor when immutable history or cross-specialist correlation becomes a hard requirement. Document the trigger conditions; that is more valuable than over-building on day one.

In single-agent systems the four patterns still apply. Behavioral guidance can live in the prompt, while deterministic policy can live in the harness, middleware, approval gate, or tool boundary without requiring another LLM agent. Part 3 shows those placements in one single-agent system.
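The four ordered questions translate directly into a first-match function. A minimal sketch, with hypothetical field names on `Concern`; the second return value reflects the rule stated earlier in this section that placement does not replace enforcement.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Concern:
    """Answers to the four placement questions (hypothetical field names)."""
    name: str
    handled_externally: bool = False
    guidance_for_one_specialist: bool = False
    protects_high_stakes_action: bool = False  # access, money, regulated, irreversible
    needs_correlation_or_trail: bool = False


def place(c: Concern) -> Tuple[str, bool]:
    """Return (placement pattern, whether a deterministic control is required)."""
    # High-stakes rules keep a deterministic control even when the
    # guidance for them lives in a prompt.
    needs_control = c.protects_high_stakes_action
    # Questions are asked in order; the first "yes" picks the pattern.
    if c.handled_externally:
        return "external encapsulation", needs_control
    if c.guidance_for_one_specialist:
        return "embedded in specialist prompt", needs_control
    if c.protects_high_stakes_action:
        return "deterministic policy at boundary", needs_control
    if c.needs_correlation_or_trail:
        return "auditor", needs_control
    return "unplaced", needs_control
```

An "unplaced" result is a prompt to go back to the inputs, since every concern needs a deliberate placement.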

07 TOOL TAXONOMY

Tool taxonomy: Type 1 vs. Type 2

Not all tools are the same shape, and the difference drives where state belongs. Recognize the two classes early: most consequential external state in production agents lives behind the second one.

Type 1: Transactional

Single call, immediate result, no external lifecycle. Send-SMS, query-database, get-current-time. The agent gets a result and moves on; nothing keeps living after the call returns.

Type 2: Managed-resource platform

A single call returns a reference to long-lived external state. The external system owns the lifecycle, state, and operational complexity; the agent sees a clean tool surface. A job-posting platform managing many ad networks; a background-check vendor running a multi-day verification; a calendar with event lifecycles.

Type 2 tools have three consequences for decomposition.

Specialists should call them but not replicate their lifecycle logic; a specialist accumulating state that mirrors an external system is a sign the tool should be Type 2, not that the specialist needs more state.
When action on one Type 2 resource implies action on another (close parent X → close child resources Y on a different platform), that orchestration belongs at the coordinator, not in either specialist.
Type 2 tools fail in state-dependent ways (platform down, resource expired, budget exceeded mid-run) that specialists must surface gracefully without owning the underlying lifecycle.

Tool taxonomy is topology-independent: it applies whether you build single-agent or multi-agent.
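A Type 2 tool can be pictured as a call that returns a handle while the platform keeps the lifecycle. The sketch below uses an in-memory `FakeJobBoard` as a stand-in for an external platform; every class and method name here is hypothetical and chosen only to illustrate the shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRef:
    """Handle to long-lived state owned by an external platform."""
    platform: str
    resource_id: str


class ResourceUnavailable(Exception):
    """State-dependent failure the specialist surfaces but does not repair."""


class FakeJobBoard:
    """Stand-in for a Type 2 platform; a real one would be a remote service."""

    def __init__(self):
        self._status = {}  # lifecycle state lives here, not in the specialist

    def post_job(self, title: str) -> ResourceRef:
        ref = ResourceRef("job_board", f"post-{len(self._status) + 1}")
        self._status[ref.resource_id] = "live"
        return ref

    def expire(self, ref: ResourceRef) -> None:
        self._status[ref.resource_id] = "expired"

    def boost(self, ref: ResourceRef) -> str:
        if self._status.get(ref.resource_id) != "live":
            raise ResourceUnavailable(f"{ref.resource_id} is not live")
        return "boosted"


def specialist_boost(board: FakeJobBoard, ref: ResourceRef) -> str:
    """The specialist keeps only the reference and reports failures plainly."""
    try:
        return board.boost(ref)
    except ResourceUnavailable as err:
        return f"could not boost: {err}"
```

The specialist never stores "live" or "expired" itself; it holds a `ResourceRef` and asks the platform.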

08 WORKED EXAMPLE

A worked example: when the evidence forces a split

A methodology is easiest to trust when you watch it run. Here it is on a domain where the evidence genuinely forces a split: an ATS recruiting agent, the kind of system a recruiter uses to hire software engineers and other knowledge workers. Part 3 runs the same procedure on a domain where it does not split. The contrast is the point.

Start with the pre-work from #1.

Workload inventory: ~15 workflows

Requisition management: open, draft, edit job reqs
Sourcing: advertise across job boards, SMS reactivation of past applicants, referrals
Screening: review applications, knockout questions, ranking
Pipeline management: advance or reject candidates through stages, single and bulk
Interview scheduling
Background checks
Offer letters
Pipeline analytics

The vocabularies differ sharply across clusters: acquisition, interview logistics, legal compliance.

Tool surface: 6 tools, 3 of them Type 2

ATS database: R/W, candidate PII, bulk-write with high blast radius
Job-board / advertising platform: write, cost-controlled, externally managed (Type 2)
SMS provider: write, TCPA-regulated
Calendar: R/W, event lifecycles (Type 2)
Background-check vendor: write, FCRA-regulated, legal weight (Type 2)
Analytics service: read-only

User patterns: recruiter, multi-step, bulk

A recruiter managing many open roles at once
Long sessions, multi-step workflows
Frequent bulk actions
Compound utterances ("advance these three to onsite, schedule them with the platform team, but skip anyone already in another loop")
Mid-conversation pivots are common

Constraints: FCRA, TCPA, EEOC, cost, PII

FCRA: background-check disclosure and adverse-action procedure
TCPA: SMS consent and opt-out
EEOC and anti-discrimination law: disposition decisions must be defensible and auditable
Cost: advertising-spend controls
PII: candidate data privacy

Now evaluate the five limits from Part 1, the same ones the single-agent gate checks. In this domain the evidence is present.

Limit	Fires?	Evidence
Permissions / policy divergence	Yes, strongly	FCRA background checks and offer letters (write, legal consequence), TCPA-regulated SMS, bulk ATS mutation, and read-only analytics are genuinely different permission and regulatory surfaces. This one justifies the split on its own.
Instruction-set bloat	Yes	15+ workflows with sharply different vocabularies (sourcing channels, interview logistics, compliance language) strain a single coherent prompt.
Context saturation	Moderate	A popular role draws a large candidate pool; reasoning over many candidate records for a bulk action loads heavy context.
Parallel work	Supporting	One agent could issue concurrent calls, but these tasks also span independent, long-running, high-context tool surfaces that benefit from isolated ownership and recovery.
Trace legibility	Needed	Bulk disposition affecting many candidates needs explicit, auditable routing, especially under anti-discrimination scrutiny.

Multiple limits fire, and capability divergence alone is a decisive architectural signal: read-only analytics and an FCRA-regulated background-check workflow benefit from isolated contexts, least-privileged tool sets, and a clean audit boundary. The tools and APIs must still enforce authorization; separate prompts alone would not make the split safe. The framework escalates off single-agent.

Topology: coordinator, not router

The split is real, so the next question is the shape. State ownership is the dividing line from earlier in this post, and three traits of this workload each demand it:

Pause and resume. A recruiter starts a bulk advance, pauses to inspect one candidate, then resumes. That needs a pause/resume stack.
Compound utterances. "Advance these, schedule them, skip conflicts" is one message with several ordered sub-intents, which is Q2 decomposition.
Transition gates across specialists. No offer goes out before screening is complete and the background check has cleared, a precondition that spans specialists.

All three are coordinator territory. A flat router would lose the workflow state the moment the recruiter pivoted.
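The third trait, a transition gate across specialists, is the easiest to show in code because it should be deterministic. A minimal sketch, assuming the coordinator tracks two hypothetical flags per candidate; `offer_gate` and `send_offer` are illustrative names, and the returned strings stand in for a real tool call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateState:
    """Cross-specialist facts the coordinator tracks (hypothetical fields)."""
    screening_complete: bool = False
    background_check_cleared: bool = False


def offer_gate(state: CandidateState) -> List[str]:
    """Return the unmet preconditions; an empty list means the gate is open."""
    blockers = []
    if not state.screening_complete:
        blockers.append("screening not complete")
    if not state.background_check_cleared:
        blockers.append("background check not cleared")
    return blockers


def send_offer(state: CandidateState) -> str:
    # The check runs in code before the tool call, not in a prompt.
    blockers = offer_gate(state)
    if blockers:
        return "blocked: " + "; ".join(blockers)
    return "offer sent"
```

Neither the screening specialist nor the background-check specialist can answer this question alone, which is why the gate sits with the component that sees both.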

Specialist boundaries

The cut lines come from the axes in #5, permissions first.

Specialist	Owns	Why it is its own boundary
Coordinator	Q1 state, Q2 decomposition, bulk-action confirmations, transition-gate policy, cross-tool consequences	Orchestration, not domain work. The stateful burden lives here.
Hiring Gate	Background checks, offer letters	Axis 1, the strongest: FCRA and legal weight isolate this surface. Axis 4: it is the cleanest audit boundary.
Sourcing	Job advertising, SMS reactivation, referrals	Axis 1: the TCPA-regulated SMS surface is distinct. Axis 2: acquisition is one coherent tool surface.
Pipeline & Scheduling	Disposition (single and bulk), interview scheduling	Axis 2: advancing a candidate and booking the interview are one tightly coupled tool surface.
Requisition	Create, draft, edit, look up job reqs	Axis 2: a coherent tool surface, the ATS requisition tables.
Analytics	Reporting, pipeline metrics	Borderline. Read-only, so lean toward a coordinator-called tool. Make it a specialist only if it must reason about which metric answers the question; otherwise it is the tool-equivalent specialist anti-pattern.

The shape that falls out is one coordinator owning state and orchestration, over five specialists clustered by the axes above, permissions first.

The ATS recruiting decomposition: a stateful coordinator over five specialists, cut along permission and tool-surface boundaries.

Concerns and tools

Cross-cutting concerns land across three of the four placement patterns:

Embedded guidance: the Hiring Gate explains the FCRA workflow; Sourcing explains TCPA-sensitive behavior. The prompts guide reasoning, while the background-check and messaging tools enforce authorization, consent, and allowed actions.
Deterministic coordinator policy: bulk-action confirmations and transition gates run in the orchestration harness before high-impact tools execute.
Auditor, deferred: EEOC adverse-action correlation. Stand it up only when an immutable, cross-specialist audit trail becomes a hard requirement, not before.

On tools, the advertising platform, the calendar, and the background-check vendor are Type 2: each owns long-lived external state behind a clean tool surface. That is exactly why "close a requisition, then take down all of its job postings" belongs at the coordinator. It is a consequence that spans two platforms, the ATS and the Type 2 advertising platform, and no single specialist should own it.
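The "close a requisition, then take down all of its job postings" consequence might look like the following at the coordinator. This is a sketch under assumptions: `FakeATS` and `FakeJobBoard` are in-memory stand-ins, and `close_requisition` and the `postings_by_req` mapping are hypothetical names, not part of any real ATS or job-board API.

```python
class FakeATS:
    """Stand-in for the ATS database."""

    def __init__(self):
        self.closed = set()

    def close(self, req_id):
        self.closed.add(req_id)


class FakeJobBoard:
    """Stand-in for the advertising platform."""

    def __init__(self, down=()):
        self.down = set(down)  # postings the platform cannot reach right now
        self.removed = []

    def take_down(self, posting_id):
        if posting_id in self.down:
            raise RuntimeError("platform unavailable")
        self.removed.append(posting_id)


def close_requisition(req_id, ats, job_board, postings_by_req):
    """Coordinator-level consequence: close the req, then take down its postings."""
    ats.close(req_id)
    taken_down, failed = [], []
    for posting_id in postings_by_req.get(req_id, []):
        try:
            job_board.take_down(posting_id)
            taken_down.append(posting_id)
        except RuntimeError:
            # Surface the failure; the platform still owns the posting lifecycle.
            failed.append(posting_id)
    return {"closed": req_id, "taken_down": taken_down, "failed": failed}
```

Returning the `failed` list keeps partial failure visible to the recruiter instead of hiding it inside either specialist.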

Write this up and you already have most of a Decomposition ADR. The next section gives it a shape.

09 THE OUTPUT

The output: a Decomposition ADR

The methodology produces an artifact, not a vibe: a Decomposition ADR you can review, challenge, and revise as the system evolves. The point is not to be right on day one; it is to make the decisions explicit so future revisions can challenge them. A defensible ADR contains eleven things:

1 · Workload inventory

Every workflow with frequency and criticality.

2 · Tool surface

Each tool with permission profile, regulatory weight, blast radius, and Type 1 / Type 2 classification.

3 · User-pattern summary

Interaction shape, session length, compound-intent likelihood, pause/resume burden.

4 · Constraints

Compliance, cost, latency, audit.

5 · Limit evaluation

Which of the Part 1 limits fire, with evidence.

6 · Topology choice

Single agent / router / coordinator+specialists / other, with alternatives considered and rejected.

7 · Specialist boundaries

Each specialist with owned workflows, tools, and the axes that justify the boundary.

8 · Coordinator responsibilities

If applicable: state model, gates, sequencing rules.

9 · Cross-cutting concern placement

Each concern with placement and rationale.

10 · Migration triggers

Conditions under which placements should be revisited.

11 · Open questions

Design judgments that need empirical validation.
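Because the ADR has a fixed set of sections, a draft can be checked for completeness mechanically before review. A minimal sketch: the section keys below are assumed identifiers for the eleven items above, and treating coordinator responsibilities as optional follows the "if applicable" qualifier on item 8.

```python
from typing import Dict, List

# Assumed keys for the eleven ADR sections, in the order listed above.
ADR_SECTIONS = [
    "workload_inventory",
    "tool_surface",
    "user_patterns",
    "constraints",
    "limit_evaluation",
    "topology_choice",
    "specialist_boundaries",
    "coordinator_responsibilities",
    "concern_placement",
    "migration_triggers",
    "open_questions",
]

# Only meaningful when a coordinator exists ("if applicable" above).
OPTIONAL = {"coordinator_responsibilities"}


def missing_sections(adr: Dict[str, str]) -> List[str]:
    """List required sections that are absent or empty in a draft ADR."""
    return [
        key for key in ADR_SECTIONS
        if key not in OPTIONAL and not adr.get(key)
    ]
```

A check like this only confirms that each section exists; whether the reasoning inside it holds up is still a matter for review.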

The ADR is also meant to prevent a set of failure modes, the anti-patterns that show up again and again:

Anti-pattern	What it looks like	Why it is wrong
Splitting on persona alone	Two specialists, identical tools, different tone	Persona alone never justifies a split
Specialist proliferation	One specialist per workflow in the inventory	Specialists are clusters, not mirrors
Coordinator as god object	Business logic and content generation inside the coordinator	The coordinator owns orchestration, not domain work
Hidden state in specialists	A specialist mirrors an external system as a state machine	That state belongs in a Type 2 tool
Flat router for stateful workload	Router chosen "because simpler," then state bolted on	Router and coordinator are different shapes, not effort levels
Skipping the input phase	Jumping to topology before inventory and constraints	Pre-work is non-optional

The decision in one breath

Inputs before topology: workload, tools, users, constraints, concerns, failures.
Single agent until a named limit fires; capability divergence is the strongest architectural signal, with enforcement at the tool or API boundary.
Coordinator when one component must own state, decomposition, sequencing, or gates across turns or specialists.
Specialists are clusters that reason; tools are interfaces. Type 2 tools, not specialists, own external lifecycle.
Write the ADR. Document migration triggers. Revise on evidence.

References

OpenAI, A Practical Guide to Building Agents
Anthropic, Building Effective Agents
Anthropic, How we built Claude Code auto mode: a safer way to skip permissions
OpenAI Agents SDK, Agents as tools
Claude Agent SDK, Subagents