The first question every multi-agent user hits is the same one: why does my agent produce slop? The channel's framing across four videos is sharp: the agent is not stupid, it is context-overloaded. One agent running one big context window for one big task is exactly the wrong shape — it tries to remember everything, rushes to deliver, produces generic output, and hallucinates. Sub-agents fix all four. The fix is structural: a small main agent with a focused context plans and coordinates, and a fleet of disposable workers with minimal context each do one thing well. The "vibe-coded slop" problem is not a model problem; it is a context-architecture problem.

This article walks through the "vibe-coded slop" diagnosis, the sub-agent solution, and the audit rule that lets you tell whether a sub-agent actually fired. Two videos anchor it: the channel's clearest statement of the problem and fix (K-afyUDWEtY), and the short clip on whether a sub-agent did the work at all (g3adOMPsiiI, 262 views).

What you'll learn

  • Single-agent workflows fail in four specific ways — context overload, rushed execution, generic output, higher error rates — and all four are downstream of "one context, one task, one model" (13-sub-agents.md, "Why Sub-Agents Matter").
  • The sub-agent fix is a context-budget split: a 100K-token main agent (50K preferences + 30K project history + 20K current conversation) plans, while 15K-token sub-agents (5K instructions + 10K relevant data) execute. The cheap model stays in the smart zone because the per-worker context is small (13-sub-agents.md, "Context Window Optimization").
  • Sub-agents are not enabled by default in OpenClaw — you have to ask for them, and on cheap models you have to ask every day (13-sub-agents.md, "Enabling Sub-Agents" and "Model-Specific Considerations").
  • "I deployed a sub-agent" is unfalsifiable unless the worker has a named persona in the run log. The audit rule: treat unfalsifiable sub-agent claims as suspect; require names (g3adOMPsiiI, transcript).
  • Parallel execution is the time win: research agents fan out simultaneously, presentation agents wait on them, graphics agents wait on the presentation. Sequential is the default if you do not ask for parallel; you have to ask (13-sub-agents.md, "Best Practices").
  • Structured output from each sub-agent is what makes synthesis tractable — without it, the main agent gets a wall of prose and re-does the synthesis work the workers were supposed to do (13-sub-agents.md, "Advanced Patterns").

The diagnosis: why single-agent workflows produce slop

The channel's framing in this video is unambiguous: the slop is not a model problem, it is a context-architecture problem. When one agent handles everything end-to-end, four specific failure modes compound:

  • Context overload. The agent tries to remember too much. The 100K-token main-agent budget is split across your preferences (50K), project history (30K), and the current conversation (20K) — and the model is supposed to do good work with the leftover attention. It cannot. It loses the thread, drifts, and starts producing generic answers (13-sub-agents.md, "The Problem with Single-Agent Workflows").

  • Rushed execution. The pressure to deliver results quickly leads to shortcuts. A single agent under one big context is racing against its own context window — once it crosses ~40% context, it enters the "dumb zone" from Course 4 §4.5, and the model starts ignoring instructions, hallucinating details, and under-weighting the rules you actually care about (13-sub-agents.md, "Context Window Optimization").

  • Generic output. A lack of specialization produces boilerplate. The single agent has no "I am a researcher" or "I am a graphics designer" persona — it is one generalist trying to be many specialists, and the result reads like the lowest common denominator (13-sub-agents.md, "The Problem with Single-Agent Workflows").

  • Higher error rates. No cross-checking or validation. With one agent, an error compounds: a wrong assumption early on becomes a wrong answer late, and nothing in the loop catches it (13-sub-agents.md, "The Problem with Single-Agent Workflows").

The four failure modes share a single root cause: one model, one context, one task. The sub-agent solution shares a single fix: split the work into specialised workers with minimal context each, and let the main agent coordinate.

The sub-agent solution, summarised

The wins are five (13-sub-agents.md, "The Sub-Agent Solution"):

  • Massive improvement in quality — specialised agents produce focused, accurate work
  • Parallel execution — multiple tasks run simultaneously, saving time
  • Context optimisation — each sub-agent has minimal context, maximising intelligence
  • Built-in validation — multiple agents can cross-check each other's work
  • Reduced hallucination — if one agent hallucinates, others can correct it

The fifth is the one the channel leans on most. A research sub-agent that hallucinates a source URL is caught by a presentation sub-agent that tries to render the URL and gets a 404. A coding sub-agent that returns a syntactically valid but semantically wrong function is caught by a review sub-agent that runs the test suite. The gain comes from the structure of the workflow, not from a model upgrade.

The architecture: one brain, many hands

The diagram from the source material is worth reproducing (13-sub-agents.md, "Architecture"):

┌─────────────────────────────────────┐
│   Main Agent (Orchestrator)         │
│   - Knows everything about you      │
│   - Plans and coordinates           │
│   - Synthesises results             │
└──────────────┬──────────────────────┘
               │
       ┌───────┴───────┐
       │               │
┌──────▼──────┐ ┌─────▼───────┐
│ Sub-Agent 1 │ │ Sub-Agent 2 │
│ Research    │ │ Graphics    │
│ (Minimal    │ │ (Minimal    │
│  context)   │ │  context)   │
└─────────────┘ └─────────────┘

Five principles (13-sub-agents.md, "Key Principles"):

  1. The orchestrator knows you — main agent has full context about your preferences, project history, and current goal
  2. Sub-agents are specialised — each focuses on one specific task
  3. Minimal context per sub-agent — they don't need your life story
  4. Parallel execution — work happens simultaneously
  5. Results aggregation — main agent synthesises outputs

The first principle is the load-bearing one. The orchestrator is the only agent that holds your preferences, your project history, and the meta-context for the current task. Sub-agents receive a task brief — the relevant data, the relevant instructions, and the relevant format — and nothing else. That is the context budget that keeps them in the smart zone.
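
The brief-only handoff described above can be sketched as a pair of data types. This is a minimal illustration, not the channel's or OpenClaw's implementation: `OrchestratorContext`, `TaskBrief`, and `render_brief` are hypothetical names, and the example values are made up. The point it shows is that the function building the worker prompt never receives the orchestrator's context, so preferences and history cannot leak into a worker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchestratorContext:
    """Everything the main agent holds. Never forwarded to workers."""
    preferences: str
    project_history: str
    conversation: str


@dataclass(frozen=True)
class TaskBrief:
    """The only thing a sub-agent receives (hypothetical structure)."""
    role: str
    instructions: str
    relevant_data: tuple[str, ...]
    output_format: str


def render_brief(brief: TaskBrief) -> str:
    """Build the worker prompt. It takes no OrchestratorContext by design."""
    data = "\n".join(f"- {item}" for item in brief.relevant_data)
    return (
        f"Role: {brief.role}\n"
        f"Instructions: {brief.instructions}\n"
        f"Relevant data:\n{data}\n"
        f"Output format: {brief.output_format}"
    )


# Illustrative values only.
ctx = OrchestratorContext(
    preferences="prefers dark slides",
    project_history="notes from earlier sessions",
    conversation="make a presentation on sub-agents",
)
brief = TaskBrief(
    role="Research",
    instructions="Find why sub-agents matter",
    relevant_data=("topic: sub-agents",),
    output_format="markdown with a Key Findings section",
)
prompt = render_brief(brief)
```

The orchestrator reads `ctx` to decide what goes into each `TaskBrief`; the worker only ever sees `prompt`.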

The context math: 100K for the brain, 15K per hand

The context-window math from the source is worth quoting in full (13-sub-agents.md, "Context Window Optimization"):

Main agent context:

  • Your preferences: 50K tokens
  • Project history: 30K tokens
  • Current conversation: 20K tokens
  • Total: 100K tokens

Sub-agent context:

  • Task instructions: 5K tokens
  • Relevant data only: 10K tokens
  • Total: 15K tokens per sub-agent

The split is not arbitrary. The 100K main agent is the expensive one — that is where you pay Opus prices if you route through Opus, and that is where the orchestrator's reasoning about the project as a whole lives. The 15K per sub-agent is the cheap one — that is where Minimax, Qwen, or DeepSeek can run cheaply because the context is small enough to keep the model in its high-attention regime.

Lower-end models (Minimax, Qwen) perform significantly better with smaller context. Sub-agents keep them in the "smart zone" (under 40% context), and the cost-effectiveness of cheap Chinese models is the entire reason the channel routes sub-agents through them in the first place (13-sub-agents.md, "Performance Impact"). The 40% threshold is the same one flagged in Course 4 §4.5 for the broader "dumb zone" problem.

Higher-end models (Opus, Sonnet) still benefit from specialisation — faster execution through parallelism, better quality through focused attention. The channel's split: Opus for the main orchestrator, Sonnet or Minimax for the workers.
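
The budget arithmetic can be checked directly. The sketch below uses the token figures quoted above and the 40% threshold; the 256K window size is an assumption borrowed from the pitfalls list later in this article, and the helper names are hypothetical. Swap in your model's actual window.

```python
# Token budgets as quoted from the source material.
MAIN_AGENT = {"preferences": 50_000, "project_history": 30_000, "conversation": 20_000}
SUB_AGENT = {"task_instructions": 5_000, "relevant_data": 10_000}

SMART_ZONE_LIMIT = 0.40    # the "dumb zone" threshold cited in the text
ASSUMED_WINDOW = 256_000   # assumption: replace with your model's context window


def context_fraction(budget: dict[str, int], window: int = ASSUMED_WINDOW) -> float:
    """Share of the context window consumed by a budget."""
    return sum(budget.values()) / window


def in_smart_zone(budget: dict[str, int], window: int = ASSUMED_WINDOW) -> bool:
    """True while the budget stays under the 40% threshold."""
    return context_fraction(budget, window) < SMART_ZONE_LIMIT


main_share = context_fraction(MAIN_AGENT)   # 100K of 256K
worker_share = context_fraction(SUB_AGENT)  # 15K of 256K
```

Under the assumed 256K window the main agent sits at about 39% before it has done any work, while each worker sits at about 6%. On a 200K window the same main-agent budget is already at 50%.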

Enabling sub-agents: the most-missed step

The single most-missed detail in the source material (13-sub-agents.md, "Default Behavior"):

Important: Sub-agents are NOT enabled by default. You must explicitly request them.

The default OpenClaw or Hermes install runs as a single agent. The sub-agent pattern only kicks in when the prompt asks for it. The two invocation patterns the channel uses:

Basic invocation — one line, no structure:

Use sub-agents to research this topic and create a presentation

Advanced invocation — explicit role split:

Can you make a presentation on [topic]?
- Send sub-agents to research why it's important
- Use another sub-agent to make the presentation
- Use other sub-agents to do the SVG graphics

The advanced form is the one the channel recommends for non-trivial work. The reason: the basic form leaves it up to the model to decide how to split the work, and on cheap models the model will often decide "sub-agents are not worth the orchestration cost" and run the whole thing in one context. The advanced form commits the model to the split.
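
If you send the advanced form often, it can be generated rather than retyped. The helper below is a hypothetical convenience, not part of OpenClaw or Hermes: it only assembles a prompt string in the shape shown above. The trailing "in parallel" line is an added assumption, reflecting the article's advice to request parallel execution explicitly.

```python
def build_invocation(task: str, assignments: list[str], parallel: bool = True) -> str:
    """Assemble an explicit role-split prompt: one task line, one bullet per sub-agent."""
    if not assignments:
        raise ValueError("name at least one sub-agent assignment")
    lines = [task]
    lines += [f"- {assignment}" for assignment in assignments]
    if parallel:
        # Assumed wording; adjust to whatever phrasing your agent responds to.
        lines.append("- Run independent sub-agents in parallel")
    return "\n".join(lines)


prompt = build_invocation(
    "Can you make a presentation on [topic]?",
    [
        "Send sub-agents to research why it's important",
        "Use another sub-agent to make the presentation",
        "Use other sub-agents to do the SVG graphics",
    ],
)
```

Refusing an empty assignment list is deliberate: a prompt with no named roles is the basic form again, which leaves the split to the model.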

For Minimax and Chinese models, the channel's advice is sharper: sub-agents are essential for good results, and the main agent should be reminded to use them. The model has a tendency to skip the sub-agent step under context pressure. The recommended fix is to add to the agent's memory or skills:

When handling complex tasks, always use parallel sub-agents 
to optimise context and improve quality.

Even with that rule in place, the source warns that you "may need to remind agent daily to use sub-agents": the channel has personally seen the model forget the rule and start serialising work without the reminder (13-sub-agents.md, "Model-Specific Considerations").

For Claude Opus, the model handles sub-agents more naturally — it understands when to spawn sub-agents automatically and is better at orchestrating complex multi-agent workflows. Even on Opus, however, the channel recommends explicit instructions for critical tasks and parallel execution for time savings (13-sub-agents.md, "Claude Opus"). The "natural" usage is not "always uses sub-agents" — it is "uses sub-agents when the prompt implies the work is multi-step."

The audit rule: name your sub-agents

The 262-view clip is the load-bearing caveat for the entire sub-agent pattern. A speaker pushes back on the "I deployed a sub-agent" claim: "How do I know that Stark himself is, you know, gaslighting again and saying, 'Oh, I deployed a sub agent.' But it turns out he's the one doing the work" (g3adOMPsiiI, transcript).

The technical point: "technically speaking, these sub agents are all running Opus 4.6 or whatever model that you want on the back end. So technically, they're not different people." You get one model with one context window; the sub-agent split is a prompt-structure trick, not extra compute ("You just structure it like that" — g3adOMPsiiI, transcript).

What does work is naming and personality. The clip's recommendation: "give some names to your sub agents and then we'll give them a little bit of personalities to there" (g3adOMPsiiI, transcript). Naming makes the activation visible in logs and transcripts, so you can tell which persona actually handled a turn. Without names, "subagent did it" is unfalsifiable.

The audit rule the clip implies: treat "my subagent did it" as a suspect claim unless you can see the named persona in the run log. If you want credibility on stream, name them; if you are auditing, demand the names match the activations. This is the same rule that anchors §7.5's naming-discipline requirement on the Kanban — the multi-board isolation only works if the profile names are project-scoped and visible.
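
The audit rule is easy to mechanise once names exist. The sketch below assumes a log format in which each line starts with the persona name in square brackets; that format is an assumption, not something OpenClaw or Hermes is known to emit, so adapt the pattern to your harness. "Scout" and "Pixel" are made-up worker names.

```python
import re

# Assumed log format: "[PersonaName] message". Adjust to your harness.
PERSONA_TAG = re.compile(r"^\[(?P<name>[^\]]+)\]")


def personas_seen(log_lines: list[str]) -> set[str]:
    """Collect every persona name that actually appears in the run log."""
    seen = set()
    for line in log_lines:
        match = PERSONA_TAG.match(line.strip())
        if match:
            seen.add(match.group("name"))
    return seen


def audit_missing(log_lines: list[str], expected: list[str]) -> list[str]:
    """Return the expected sub-agent names that never show up in the log."""
    return sorted(set(expected) - personas_seen(log_lines))


run_log = [
    "[Stark] planning: split into research and graphics",
    "[Scout] started: researching documentation",
    "[Scout] completed: 15 sources",
    "[Stark] synthesising results",
]
missing = audit_missing(run_log, ["Scout", "Pixel"])
```

Here `missing` lists "Pixel": the graphics worker was claimed but never logged, so its share of the work is the suspect part of the run.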

The improvement that made sub-agents usable

Two improvements in the latest version of the sub-agent pattern are worth flagging (13-sub-agents.md, "Sub-Agent Improvements (Latest Version)"):

Error reporting. Old behaviour: a sub-agent fails silently, no notification is sent, and the main agent waits indefinitely. New behaviour: failure notifications arrive immediately with error details, and failed sub-agents can be retried. The change matters because silent failures are what kill multi-agent workflows: you wait for an hour, then discover the orchestrator has been hanging on a worker that died 50 minutes ago.

Status updates. Real-time updates stream as sub-agents fire: "Sub-agent 1 started: Researching documentation", "Sub-agent 2 started: Searching web", "Sub-agent 1 completed: Found 15 relevant sources", "Sub-agent 2 failed: Retrying with adjusted parameters". The status stream is what makes the workflow inspectable — you can see which worker is on which step without having to dig into logs.
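
A status stream in that shape is simple to fold into a per-worker state table. The sketch below parses the four example messages quoted above; it assumes every update follows the same "Sub-agent N state: detail" pattern, which may not hold for every build, and `latest_status` is a hypothetical helper.

```python
import re

# Assumes every update looks like "Sub-agent <n> <state>: <detail>".
STATUS = re.compile(r"^Sub-agent (\d+) (started|completed|failed): (.+)$")


def latest_status(lines: list[str]) -> dict[int, tuple[str, str]]:
    """Map each worker number to its most recent (state, detail)."""
    state = {}
    for line in lines:
        match = STATUS.match(line)
        if match:
            worker, status, detail = match.groups()
            state[int(worker)] = (status, detail)
    return state


stream = [
    "Sub-agent 1 started: Researching documentation",
    "Sub-agent 2 started: Searching web",
    "Sub-agent 1 completed: Found 15 relevant sources",
    "Sub-agent 2 failed: Retrying with adjusted parameters",
]
state = latest_status(stream)
# Workers whose latest state is not "completed" still need watching.
needs_attention = [worker for worker, (status, _) in state.items() if status != "completed"]
```

With the quoted stream, worker 1 ends as completed and worker 2 is flagged, which is the check you would otherwise do by eye.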

The cross-reference to the Kanban in §7.3: the Kanban implements these improvements as a first-class UI surface. Worker Logs in the Kanban UI show the same status stream, plus a retry history that an OpenClaw sub-agent flow does not give you by default.

Cost considerations: more work, more tokens, but better results

Sub-agents cost more in raw token spend (13-sub-agents.md, "Token Usage"):

  • Each sub-agent consumes tokens
  • Parallel execution means simultaneous API calls
  • Total cost is higher than single-agent approach

The trade-off is real, but the channel's framing is that the quality gains more than pay for the quantity cost:

  • Each sub-agent uses fewer tokens (smaller context) — per-call cost is lower
  • Better results mean less rework — a single high-quality sub-agent run is cheaper than three redo-runs of a serial slop-producer
  • Time savings offset cost increase — parallel research + parallel writing completes in 15 minutes vs. 30 for serial

For Minimax users, the subscription includes a generous prompt allowance and sub-agents are "cost-effective within plan limits" (13-sub-agents.md, "Cost Optimisation"). For Opus users, the channel recommends monitoring usage for expensive workflows and using sub-agents for "high-value tasks" only — i.e. the tasks where the quality difference is large enough to justify the Opus bill.

The mixed-model strategy is the most cost-effective at scale: Opus for the main orchestrator, Sonnet or Minimax for the sub-agents. The split is configured at the sub-agent prompt level:

Use Opus for main orchestration, but spawn 
sub-agents using Sonnet for cost efficiency

The config is "an advanced topic" in the source material (13-sub-agents.md, "Mixed Model Strategy") — i.e. it requires hand-editing the agent's profile config or the per-skill routing — but the savings on a 10-sub-agent workflow are real. The Kanban in §7.3 makes this much easier: each profile is a separate agent with its own inference_provider and API key, so the routing is per-profile, not per-prompt.
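
A rough way to see why the worker tier dominates the bill on a 10-sub-agent workflow is sketched below. The prices are placeholder relative units, not real vendor prices, and the tier names are hypothetical. The calculation counts only the context budgets from the context math and ignores output tokens and retries, so treat the result as an illustration of the shape of the saving, not a forecast.

```python
# Placeholder relative prices per 1K tokens. NOT real vendor prices;
# only the ratio between the two tiers matters here.
PRICE_PER_1K = {"orchestrator_tier": 5, "worker_tier": 1}

ORCHESTRATOR_TOKENS_K = 100  # main-agent budget from the context math
WORKER_TOKENS_K = 15         # per-sub-agent budget from the context math


def workflow_cost(n_workers: int, worker_tier: str) -> int:
    """Cost in placeholder units: one orchestrator pass plus n worker passes."""
    orchestrator = ORCHESTRATOR_TOKENS_K * PRICE_PER_1K["orchestrator_tier"]
    workers = n_workers * WORKER_TOKENS_K * PRICE_PER_1K[worker_tier]
    return orchestrator + workers


all_top_tier = workflow_cost(10, "orchestrator_tier")  # every worker on the expensive model
mixed = workflow_cost(10, "worker_tier")               # workers on the cheaper model
```

With the assumed 5:1 ratio the mixed setup costs 650 units against 1250. The actual saving depends entirely on the real price ratio between your two models.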

Common pitfalls (preview of §7.4 and §7.5)

Three of the most common sub-agent pitfalls from the source material are worth previewing here because they recur in §7.4 and §7.5:

  • Not requesting sub-agents. Agent does everything itself, produces slop. The fix is explicit prompting and a memory pin (13-sub-agents.md, "Pitfall 1").
  • Too much context per sub-agent. Sub-agents receive full context, defeating the purpose. The fix is for the main agent to send only task-relevant information — the 15K budget, not the 100K main context (13-sub-agents.md, "Pitfall 2").
  • Sequential execution. Sub-agents run one after another, wasting time. The fix is to request parallel execution explicitly: "spawn all research agents in parallel, then synthesise" (13-sub-agents.md, "Pitfall 3").

The fourth pitfall — no result synthesis — is the one the Kanban in §7.3 solves structurally. The Kanban dashboard gives you a "done" lane that only flips on a successful artefact write, so "synthesis happened" is no longer a thing you have to remember to ask for.
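
The difference between sequential and parallel execution in the third pitfall can be sketched with `asyncio`. This is a toy model under stated assumptions: `run_worker` is a hypothetical stand-in that sleeps instead of calling a model, and the worker names are illustrative. The two research workers fan out together; the presentation waits on them, and the graphics wait on the presentation.

```python
import asyncio


async def run_worker(name: str, seconds: float) -> str:
    """Stand-in for a sub-agent call; sleeps instead of calling a model."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def pipeline() -> list[str]:
    # Stage 1: independent research workers run concurrently.
    research = await asyncio.gather(
        run_worker("research-docs", 0.05),
        run_worker("research-web", 0.05),
    )
    # Stage 2: the presentation depends on the research, so it waits.
    presentation = await run_worker("presentation", 0.05)
    # Stage 3: graphics depend on the presentation.
    graphics = await run_worker("graphics", 0.05)
    return [*research, presentation, graphics]


results = asyncio.run(pipeline())
```

Only stage 1 gains from parallelism here. The dependent stages stay sequential, which is why adding workers helps only where tasks are genuinely independent.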

Try it yourself

The hands-on goal: prove the sub-agent fix on a single non-trivial task, then audit the run to confirm the workers actually fired.

  1. Pick a task that has at least three sub-tasks. A presentation on a topic, a research summary, a code review, or a content piece. The three-sub-task minimum is what makes the parallel gain measurable.
  2. Run it once without the sub-agent prompt. Send the bare brief. Note the output: generic, slow, and missing the structural detail you wanted.
  3. Re-run with an explicit sub-agent split. Use the advanced invocation pattern:
    Make a presentation on [topic].
    - Sub-agent 1: research current state of the art
    - Sub-agent 2: identify three case studies
    - Sub-agent 3: draft the slide outline
    - Sub-agent 4: write speaker notes
    Synthesise the final presentation.
    
  4. Time both runs. If the parallel run is faster, you've reproduced the channel's result. If it's slower, the orchestration overhead is dominating and you have too many workers.
  5. Audit the run log. Look for the named personas in the log entries. If the names don't appear, the model serialised the work and labelled it "sub-agents" after the fact. That is the "gaslighting" failure mode from the g3adOMPsiiI clip — catch it on the first run, not the tenth.
  6. Inspect the per-worker context. Most agent harnesses expose a "context used" indicator per worker. Confirm the workers ran at ~15K context, not 100K. If a worker ran at 100K context, the orchestrator leaked your preferences into the worker brief and the smart-zone benefit is gone.
  7. Re-run with structured-output instructions. Add to the sub-agent prompts: "Return your findings in markdown with ## Key Findings, ## Notable Opinions, ## Source Links, ## Patterns Observed, ## Gaps Identified sections." Re-run and time the synthesis step. Structured output cuts the synthesis time by ~10x because the orchestrator is parsing, not reading.
  8. Add a validation sub-agent. Once the four-worker setup works, add a fifth: "Sub-agent 5: review the synthesised presentation for accuracy, completeness, and consistency. Report any issues." Compare the quality of the validated output vs. the unvalidated output. The validated output is what production-grade sub-agent workflows ship.
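
For step 7, a small check makes "structured output" testable rather than a matter of impression. The sketch below verifies that a worker's markdown contains the five section headings named in the step; `missing_sections` is a hypothetical helper and the sample report is made up.

```python
REQUIRED_SECTIONS = [
    "Key Findings",
    "Notable Opinions",
    "Source Links",
    "Patterns Observed",
    "Gaps Identified",
]


def missing_sections(markdown: str) -> list[str]:
    """Return the required '## ' headings that the worker output lacks."""
    headings = {
        line[3:].strip()
        for line in markdown.splitlines()
        if line.startswith("## ")
    }
    return [section for section in REQUIRED_SECTIONS if section not in headings]


# Made-up worker output with two of the five sections.
report = "## Key Findings\n- point one\n\n## Source Links\n- (none yet)\n"
gaps = missing_sections(report)
```

An empty result means the orchestrator can parse the report section by section. A non-empty one is a reason to send the brief back to the worker before synthesis starts.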

Common pitfalls

  • Not requesting sub-agents. The default OpenClaw / Hermes behaviour is single-agent. If you do not ask for sub-agents, the model runs the whole task in one context — the "vibe-coded slop" diagnosis applies in full.
  • Asking for sub-agents but not specifying parallel. The default on cheap models is sequential sub-agents — the workers run one after another. Always specify "in parallel" or "spawn all simultaneously."
  • Sending full context to sub-agents. The orchestrator should pass a task brief: the relevant data, the relevant instructions, the relevant format. If it forwards its own 100K context, the sub-agent starts at 100K before any task data is added. That is already about 39% of a 256K window and 50% of a 200K one, so the worker is at or past the 40% threshold and the smart-zone benefit is gone.
  • Accepting "subagent did it" claims without audit. Treat any sub-agent claim as suspect unless the named persona appears in the run log. The g3adOMPsiiI clip is the rule: name them, log them, audit them.
  • Forcing structured output without providing the format. "Return your findings" is not a structured-output spec. "Return markdown with these five sections" is. The first produces prose; the second produces a parseable artefact.
  • Skipping synthesis. "The sub-agents did the work" is not a deliverable. The orchestrator must synthesise the worker outputs into a coherent final result. The Kanban in §7.3 enforces this with the "done" lane — the dashboard only flips to done on a successful artefact write.
  • Letting sub-agents share a context. Two sub-agents that share a thread are one agent with extra latency, not two workers. The whole point of sub-agents is isolation.
  • Using Opus for every sub-agent. Opus-for-everything is the most expensive configuration. The mixed-model strategy (Opus orchestrator + Sonnet/Minimax workers) gives you ~80% of the quality at ~30% of the cost.
  • Skipping error reporting. The latest sub-agent improvements ship failure notifications — if you do not see them, you are on an old build. Update before you scale.
  • Reading the cost warning as a reason not to use sub-agents. Yes, sub-agents consume more tokens. Yes, parallel execution means simultaneous API calls. The quality gains and time savings more than offset the cost on a non-trivial task. The cost warning is a "watch your meter" prompt, not a "don't use sub-agents" rule.

Sources

  • Sub-Agents stop slop · video_id: K-afyUDWEtY · cited: single-agent failure modes, sub-agent solution, architecture diagram, context budget math, enabling sub-agents, model-specific considerations, error reporting, status updates, cost considerations
  • Is Your Subagent Actually Doing the Work — 262 views · video_id: g3adOMPsiiI · cited: "all running Opus 4.6" sub-agent framing, name-and-persona audit rule
  • Source MD · /home/ubuntu/boxai3/docs/courses/_archive-2026-06-18/13-sub-agents.md (13-sub-agents.md, full). Every concrete claim in this article is sourced from the MD file: the 100K vs 15K context math, the four single-agent failure modes, the five sub-agent wins, the explicit-enablement warning, the Claude Opus / Minimax model split, the cost / time / quality trade-off, and the "Common Pitfalls" entries.