📝 Written ● Advanced Updated 2026-05-13

Multi-agent swarms: when to use, when to skip

Spawning sub-agents multiplies both capability and cost. The naive default — "one big agent loop that does everything sequentially" — is right surprisingly often. The cases where a swarm wins are specific: independent parallel work, adversarial review, exploration with multiple angles. This is the decision rule and the patterns that hold up.

LingCode supports both. A single agent runs a long conversation, accumulates context as it goes, and tackles tasks step by step; this is what most people mean when they say "Claude wrote the code." A swarm spawns one or more sub-agents, each with its own clean context and its own scoped task, run in parallel or in sequence, with results integrated by the parent. The marketing story is "more agents = better results"; the engineering reality is "more agents = more tokens, more synthesis work, more places to lose the thread." Both shapes have their place. Picking the right one is the difference between getting work done faster and paying for an expensive distraction.

The mental model that holds up: agents share nothing except what you explicitly pass. Each sub-agent starts fresh — no memory of the parent's earlier work, no shared scratchpad. The parent agent passes a prompt; the sub-agent does the task; the parent reads the result. That isolation is the swarm's power (parallel work without context contamination) and its cost (you can't ask a sub-agent to "continue what we were doing" — you have to re-explain). Tasks where the right move requires the full conversation history are exactly the tasks where a swarm hurts. Tasks where each sub-task is independent and well-scoped are exactly where a swarm helps.

This tutorial covers the decision rule, the four patterns that work, the three that don't, and the specific LingCode features (Explore, Plan, parallel Task spawning) that operationalize them. We'll use code-review and refactor scenarios as the running examples because that's where the contrast is sharpest.

What you'll learn

What sub-agents share and what they don't (the isolation rule)
Four patterns where swarms beat a single agent
Three anti-patterns where swarms underperform
Concrete cost: how token spend scales with agent count
The Explore vs Plan vs Task split LingCode already exposes
How to phrase a parent prompt so sub-agents don't drift
The synthesis problem and how to design around it

Step 1: The one-rule framework

"Can the sub-tasks be done without knowing about each other?"

If yes → swarm is a candidate. If no → single agent.

Examples that pass the test (sub-agents work fine):

"Audit each of these 5 microservices for security issues." Each audit is independent; results combine at the end.
"Read these 12 files and summarize what each does." Per-file summaries don't need each other.
"Try three different architectures for this feature, return the best one." Independent attempts with a final pick.
"Critique this PR (one agent), independently of the agent that wrote it." Adversarial pairing.

Examples that fail the test (single agent wins):

"Refactor this codebase to use TypeScript strict mode." Every file's change depends on the patterns established in the earlier files.
"Debug this failing test." The debugging is a chain of inferences; each step builds on the last.
"Write a tutorial about X." The first paragraph constrains the second; sequential thinking is the work.

When you find yourself thinking "I'll spawn one agent to do the planning, then another to do the work" — pause. The planner's context is exactly what the worker needs; spawning a separate worker means re-feeding all of it. A single agent with a "plan-then-act" prompt usually beats two sequential agents because no context is lost in handoff.
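The one-rule test can be written down as a tiny check over a task list. This is only a sketch: the SubTask type, its needs field, and the task names are hypothetical illustrations, not anything LingCode exposes, and in practice you fill in the dependencies by judgment rather than by tooling.

```python
from dataclasses import dataclass, field


@dataclass
class SubTask:
    name: str
    # Names of sibling sub-tasks whose output or working context this one needs.
    needs: set[str] = field(default_factory=set)


def swarm_candidate(tasks: list[SubTask]) -> bool:
    """Return True when no sub-task needs anything from a sibling sub-task."""
    names = {task.name for task in tasks}
    return all(not (task.needs & names) for task in tasks)


# Five independent audits: nothing depends on anything else.
audits = [SubTask(f"audit-service-{i}") for i in range(1, 6)]

# A strict-mode refactor: later files depend on patterns set in earlier ones.
strict_mode = [
    SubTask("convert-utils"),
    SubTask("convert-api", needs={"convert-utils"}),
    SubTask("convert-ui", needs={"convert-api"}),
]

print(swarm_candidate(audits))       # True: swarm is a candidate
print(swarm_candidate(strict_mode))  # False: single agent
```

A True result only means a swarm is worth considering; the anti-patterns in Step 3 can still rule it out.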

Step 2: The four patterns that work

Each one wins for a specific reason

Pattern 1: Parallel independent work. N tasks, none depend on each other, results combine at the end. The canonical case: "audit these 5 services in parallel." Five sub-agents run simultaneously; total wall-clock time ≈ longest single audit, not sum of all audits. Token cost scales linearly with N but you're paying for actual work, not coordination.

Pattern 2: Exploration with multiple angles. "Try three implementation approaches for this feature." Each sub-agent attempts one approach in isolation. The parent reads the three results and picks (or synthesizes). Wins when you'd otherwise commit to one approach prematurely; the parallel exploration catches the case where approach #2 was clearly better but you'd never have written it.

Pattern 3: Adversarial review. One agent produces output; a second agent — with a clean context — reviews it. The reviewer's lack of context is the feature: they catch things the author rationalized. The classic version is "writer agent + editor agent" but it works for code (proposer + critic) and design (designer + skeptic). Two-agent minimum; more reviewers don't usually help.

Pattern 4: Long-horizon plans with isolated stages. Some workflows have stages that legitimately don't need each other's working state — "research this topic" then "draft a doc using the research" then "format and lint the doc." Each stage's output is the next stage's input; intermediate scratch doesn't need to survive. A sequence of sub-agents (each fed only the prior's final output) keeps context budgets low and lets each stage focus.
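The handoff in Pattern 4 can be sketched as a plain function pipeline in which each stage receives only the previous stage's final output. The stage functions below (research, draft, lint) are hypothetical stand-ins for sub-agent calls; the point is the shape of the data flow, not the stage bodies.

```python
from typing import Callable

# A stage takes the previous stage's final output and returns its own.
Stage = Callable[[str], str]


def run_stages(initial_input: str, stages: list[Stage]) -> str:
    """Run stages in order; each one sees only the prior stage's final output."""
    payload = initial_input
    for stage in stages:
        payload = stage(payload)  # intermediate scratch never leaves the stage
    return payload


def research(topic: str) -> str:
    # Scratch work stays local; only the first note is handed on.
    scratch = [f"note about {topic}", "dead end", "another dead end"]
    return scratch[0]


def draft(notes: str) -> str:
    return f"Draft based on: {notes}"


def lint(doc: str) -> str:
    # Normalize surrounding whitespace and end with a single newline.
    return doc.strip() + "\n"


result = run_stages("swarms", [research, draft, lint])
print(result)
```

If a later stage turns out to need the scratch list rather than the returned note, that is the signal from Anti-pattern 3 that the stages belong in one agent.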

Step 3: The three anti-patterns

Cases where spawning sub-agents costs more than it earns

Anti-pattern 1: Spawning a sub-agent for a small task. If the task fits in 5–10 sentences of prompt and would take the parent under a minute, spawning a sub-agent adds setup overhead (re-explaining context, re-loading any tools the sub-agent needs) that exceeds the actual work. Threshold: only spawn for tasks that need their own dedicated context window or take genuine minutes of compute.

Anti-pattern 2: The orchestrator-for-its-own-sake. Spawning a "manager" sub-agent whose only job is to spawn other sub-agents. The manager has no skill the parent lacks; it just adds a token-spending hop. If the parent can decide which sub-agents to spawn, the parent should spawn them directly.

Anti-pattern 3: Chained sub-agents that should be one conversation. "Agent A reads the file, Agent B analyzes what A read, Agent C writes the fix." If A's understanding of the file is what B and C need, the chain is paying re-explanation cost at every handoff. A single agent reading the file and then doing A→B→C in conversation wins. The rule of thumb: if the natural data flow between stages is everything the prior stage knew, the stages should be one agent.

Token cost grows roughly N×. Three sub-agents on the same task means roughly three times the token spend of a single agent doing it sequentially, and sometimes more, because each sub-agent re-reads context the parent already had. Run the cost math before scaling: at $3/M input tokens for Claude Sonnet, a 3-agent swarm in which each agent reads a 500K-token codebase once costs ~$4.50 per run (3 × 0.5M tokens × $3/M) before output tokens. Worth it for genuine exploration; expensive for "I had agents do it" theater.
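The arithmetic behind that figure is short enough to keep as a helper. This sketch assumes every agent reads the full context exactly once and ignores output tokens, caching, and retries, so treat it as a lower bound; the function name is made up for illustration.

```python
def swarm_input_cost(context_tokens: int, agents: int, usd_per_million: float) -> float:
    """Input-token cost in USD if every agent reads the full context once."""
    return agents * context_tokens / 1_000_000 * usd_per_million


# Figures from the text: 500K-token codebase, $3 per million input tokens.
single_agent = swarm_input_cost(500_000, agents=1, usd_per_million=3.0)
three_agents = swarm_input_cost(500_000, agents=3, usd_per_million=3.0)

print(f"single agent: ${single_agent:.2f}")  # single agent: $1.50
print(f"3-agent swarm: ${three_agents:.2f}")  # 3-agent swarm: $4.50
```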

Step 4: Phrasing the parent prompt

Make the sub-agent's job crisp; don't make it figure out its own scope

Sub-agents have no memory of the parent's reasoning. Whatever the parent passes is all they get. Two common mistakes in parent prompts:

Underspecification. "Audit this service" lets the sub-agent decide what to look at — security, performance, architecture, naming. Five sub-agents on five services may all interpret "audit" differently, producing reports that don't compare. Pin the scope: "Audit this service for SQL-injection patterns specifically. Look for unsanitized inputs reaching the DB layer; report each finding with file:line and a suggested fix."
Over-explanation. Dumping the parent's full context into the sub-agent prompt defeats the swarm's purpose. The sub-agent now has the same context bloat the parent had. Pass only what the sub-agent specifically needs.

A good parent prompt to a sub-agent reads like a self-contained bug report: enough context to do the work, no narrative about why the parent decided to delegate.
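One way to keep sub-agent prompts pinned and comparable is to render them all from the same small template. The SubAgentBrief class and the service names below are hypothetical; the scope wording is taken from the SQL-injection example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentBrief:
    target: str         # what to look at
    scope: str          # the single concern to audit for
    look_for: str       # concrete signals, so "audit" is not open to interpretation
    report_format: str  # identical across sub-agents so the reports compare

    def render(self) -> str:
        """Produce a self-contained prompt with no narrative about the parent."""
        return (
            f"Audit {self.target} for {self.scope} specifically.\n"
            f"Look for: {self.look_for}\n"
            f"Report: {self.report_format}"
        )


services = ["billing", "auth", "search"]  # hypothetical service names
prompts = [
    SubAgentBrief(
        target=f"the {name} service",
        scope="SQL-injection patterns",
        look_for="unsanitized inputs reaching the DB layer",
        report_format="each finding with file:line and a suggested fix",
    ).render()
    for name in services
]

print(prompts[0])
```

Because only the target varies, the returned reports share a scope and a format, which makes the synthesis step much cheaper.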

Step 5: LingCode's primitives

Explore, Plan, parallel Tasks, custom subagents

LingCode exposes the patterns above as specific tools:

Explore subagent — fast read-only search for "find / count / locate" work. Cheap, isolated context, returns excerpts. Use when the answer is "which files mention X" or "what's the structure of Y." Don't use for code editing — it can't write.
Plan subagent — designs an implementation strategy for a non-trivial task. Returns step-by-step plans with critical-file identification. Use as the "plan stage" of Pattern 4 (long-horizon).
General-purpose subagent — full agent loop with all tools. Use for genuinely parallel work: spawn N of these for N independent tasks.
Custom subagents via .claude/agents/ — define your own with constrained tool sets. Useful for repeated specialized tasks ("the SQL-injection auditor"). See Write a subagent.

The pattern matrix:

Parallel independent (Pattern 1) → multiple Explore or general-purpose agents in a single message.
Exploration with angles (Pattern 2) → multiple general-purpose agents, each with a different approach in the prompt.
Adversarial review (Pattern 3) → main agent produces, a custom "reviewer" subagent critiques.
Long-horizon (Pattern 4) → sequential Explore → Plan → general-purpose chain.

Spawn in parallel, not in serial. If you're spawning 5 sub-agents for Pattern 1 (independent work), send them in a single message with 5 tool calls. LingCode runs them concurrently; total time is the longest single run. Spawning them one at a time serializes the work and multiplies your wall-clock time by roughly 5 for no reason.
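The wall-clock difference is easy to see with a toy simulation. The sketch below uses asyncio.sleep as a stand-in for a sub-agent run; fake_sub_agent and the job names are hypothetical, and nothing here calls LingCode itself.

```python
import asyncio
import time


async def fake_sub_agent(name: str, seconds: float) -> str:
    """Stand-in for one sub-agent run; sleeping simulates the work."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_serial(jobs):
    # One at a time: total time is the sum of all runs.
    return [await fake_sub_agent(name, secs) for name, secs in jobs]


async def run_parallel(jobs):
    # All at once: total time is roughly the longest single run.
    return await asyncio.gather(*(fake_sub_agent(name, secs) for name, secs in jobs))


def timed(runner, jobs):
    start = time.perf_counter()
    results = asyncio.run(runner(jobs))
    return list(results), time.perf_counter() - start


# Five simulated audits taking 0.02s to 0.10s each.
jobs = [(f"audit-{i}", 0.02 * i) for i in range(1, 6)]

serial_results, serial_time = timed(run_serial, jobs)
parallel_results, parallel_time = timed(run_parallel, jobs)

print(f"serial:   {serial_time:.2f}s")
print(f"parallel: {parallel_time:.2f}s")
```

asyncio.gather returns results in the order the jobs were submitted, so the parent can match each report to its task without extra bookkeeping.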

Step 6: The synthesis problem

N agents return; now what?

The output of a swarm is N partial answers. Combining them is its own task — and the most common place a swarm setup degrades into worse-than-single-agent results. Three approaches:

Concatenate. If each sub-agent's output is independent (e.g., per-file summaries), just stack them in a final report. Cheapest; works when there's no overlap to reconcile.
Voting or top-pick. If sub-agents explored N alternatives (Pattern 2), the parent reads all N and picks the best (or merges the strongest parts). The parent's judgment is the synthesis.
A synthesis pass. Feed all N outputs back into the parent (or a dedicated synthesizer subagent) with a prompt like "Here are five independent reports on the same codebase. Identify findings that appear in ≥3 reports — those are the high-confidence issues. Identify unique findings that look material — those need a second look."

If synthesis is genuinely hard — the sub-agents produced overlapping, contradictory, hard-to-reconcile outputs — that's often a sign the task didn't actually pass the "independent sub-tasks" test from Step 1. Reconsider whether a single agent would have produced a cleaner answer.
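The counting half of a synthesis pass can be done mechanically before any model reads the reports. This sketch assumes each sub-agent's findings have already been normalized to comparable strings (a big assumption in practice); the finding strings are invented for illustration, and deciding which unique findings are material is still left to the parent.

```python
from collections import Counter


def synthesize(reports: list[set[str]], quorum: int = 3):
    """Split findings into those seen in at least `quorum` reports and those seen once."""
    counts = Counter(finding for report in reports for finding in report)
    high_confidence = sorted(f for f, n in counts.items() if n >= quorum)
    unique = sorted(f for f, n in counts.items() if n == 1)
    return high_confidence, unique


# Five hypothetical reports on the same codebase.
reports = [
    {"auth.py:42 raw SQL", "db.py:10 no timeout"},
    {"auth.py:42 raw SQL"},
    {"auth.py:42 raw SQL", "api.py:7 verbose errors"},
    {"db.py:10 no timeout"},
    {"cache.py:3 unbounded growth"},
]

high_confidence, unique = synthesize(reports)
print(high_confidence)  # ['auth.py:42 raw SQL']
print(unique)           # ['api.py:7 verbose errors', 'cache.py:3 unbounded growth']
```

Each report is a set, so a finding repeated inside one report still counts once toward the quorum.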

Step 7: Worked examples

Four real LingCode tasks, scored

Task: "Add tests for these 8 modules that have no test coverage." Each module's test suite is independent; the modules don't share code under test. Pattern 1. Spawn 8 general-purpose subagents, one per module, in parallel. Wall-clock time ≈ longest single test suite. Cost ≈ 8× single agent on one module. Worth it.

Task: "Refactor this 50-file codebase from JavaScript to TypeScript." Every file's typing depends on imports / exports of other files; type definitions established early constrain later files. Single agent. A swarm would re-derive the same type model in each sub-agent and produce inconsistent type definitions across files. A single agent that works through the files in order and holds the type map in context wins.

Task: "Investigate this production crash." Debugging is inferential; each new piece of evidence redirects the next step. Single agent. Spawning a "log reader" and a "stack tracer" and a "git blamer" produces three partial views that the parent has to reconcile — exactly the work the single agent would have done while reading the artifacts itself, with full context.

Task: "Write a marketing landing page; have one agent draft it and another critique it for clarity." Adversarial review benefits from clean context. Pattern 3. Author drafts; critic reviews without seeing the brainstorming; author revises based on the critique. Two agents, sequential, not parallel.

When in doubt, single agent

The most reliable default in 2026 is "one agent, long context, careful prompt." Modern Claude models handle 200K-token contexts well; modern tooling lets a single agent inspect hundreds of files in one run. The cases where swarms add real value are specific and increasingly few: the model improvements that closed the "context too small" gap also closed most of the case for swarms. Use them when the task genuinely has independent parallel structure, or when adversarial isolation between author and reviewer is the point. Skip them when "more agents" is just decoration.