The thesis: you don't need Opus - Claude & Anthropic

The channel's verdict on Claude is sharper than its verdict on any other vendor. Across seven videos in the Claude section of Course 2 and the standalone Opus guide, it reduces to a single sentence: Claude Opus is currently overpriced for the consumer tier, and you should route around it. The thesis rests on more than opinion: it comes from a benchmark Claude itself designed, run on Opus 4.6 against GPT 5.4, where Opus scored 40% and GPT 5.4 scored 63% on the same suite. The rest of the article walks through the comparison that set the channel's tone, then zooms out to the four signals the channel uses to defend the thesis in every Claude video that follows.

What you'll learn

Why the channel's "you don't need Opus" thesis is built on real head-to-head tests, not vibes — the 31K-view Minimax M2.7 review is the comparison that put the framing on the page.
The four-slot routing rule the creator actually runs on his own stack: Opus for the orchestrator brain, Minimax 2.7 for research and specialised agents, GPT 5.4 or Notebook LM for planning, and M2.7 for implementation.
Why a 1/16th cost ratio ($5 vs ~$0.30 per million input tokens) is the entire reason the channel routes around Claude for executor work, and what a "near Opus" output actually means in practice.
The "dumb zone" failure mode for Minimax (a 300-line soul.md pushes the model into a state where it "starts messaging your girlfriend instead of building a presentation") and why it is a model-side failure, not an OpenClaw failure.
Why a one-shot presentation finishing in 2–3 minutes on Minimax versus 10–15 minutes on Claude is the kind of timing that justifies the routing switch on its own.

The comparison that set the channel's tone

The Minimax M2.7 review is the channel's most-viewed model review (31,049 views) and the video that put the "you don't need Opus" framing on the page. The argument rests on the routing strategy the creator actually runs on his own stack, not on a single benchmark:

Stark (orchestrator) stays on Opus 4.6 for deep reasoning and autonomous terminal work.
Research and specialized agents run on Minimax 2.7.
The planning step runs on GPT 5.4 or Notebook LM; the creator calls Notebook LM "really good for planning compared to building from scratch with Claude code."
Implementation is handed to M2.7, which finished a one-shot presentation in 2–3 minutes versus 10–15 minutes on Claude for the same task.

The win isn't a single benchmark. M2.5 already beat Claude Opus 4.6 and Sonnet 4.6 on multi-SWE-bench and BFCL multi-turn, and 2.7 is positioned to push the BFCL score (76.8% on M2.5) further. The interesting trade is the $40/month wall: the high-speed variant of M2.7 requires that tier, but the standard M2.7 is cheap enough on the plus plan to do most agent work. Raw knowledge is still a Claude / GPT strength — M2.7 is explicitly an "agentic coding model," not a general chatbot. The summary takeaway the channel keeps landing on: don't pay Opus prices for executor work you can run on M2.7 for a fraction of the cost.

The cost math is the entire reason this video exists

The pricing gap is concrete: Opus starts at $5 per million input tokens, Minimax sits at roughly $0.30 per million, a 1/16th ratio. The creator's own Opus bill was $30 in a single hour on a multi-agent run; the same workflow on Minimax's starter coding plan runs under $10 for 100 prompts per 5-hour window. The math is the same one the channel's cheap-routing course makes for Claude Code, and the framing is consistent: coding plans beat token plans for running agents because flat-rate limits beat per-token optimisation when the model flaps or the agent runs 24/7.

To make the math concrete: a typical overnight Claude Code build on the creator's VPS consumes roughly 8M input tokens and 2M output tokens. On Opus 4.6 at the documented rates ($5/M input, $25/M output), that is $40 + $50 = $90 in API spend for a single overnight build. The same build on Minimax M2.7 at the channel's quoted rates ($0.30/M input, $1.20/M output) is $2.40 + $2.40 = $4.80. On this workload the blended ratio is closer to 1/19 ($4.80 against $90), slightly steeper than the input-only 1/16th headline because the output-price gap is wider. The absolute difference ($85.20 per overnight build) is the kind of money that justifies a routing change on its own.
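The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the rates and token counts quoted in this section; the names `Rates` and `build_cost` are illustrative, not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens, as quoted in this article."""
    input_per_m: float
    output_per_m: float


def build_cost(rates: Rates, input_tokens_m: float, output_tokens_m: float) -> float:
    """Return API spend in USD for a run measured in millions of tokens."""
    return rates.input_per_m * input_tokens_m + rates.output_per_m * output_tokens_m


OPUS_4_6 = Rates(input_per_m=5.00, output_per_m=25.00)
MINIMAX_M2_7 = Rates(input_per_m=0.30, output_per_m=1.20)

# The overnight build described above: roughly 8M input and 2M output tokens.
opus = build_cost(OPUS_4_6, 8, 2)
minimax = build_cost(MINIMAX_M2_7, 8, 2)

print(f"Opus 4.6:     ${opus:.2f}")
print(f"Minimax M2.7: ${minimax:.2f}")
print(f"Difference:   ${opus - minimax:.2f}")
print(f"Ratio:        {opus / minimax:.2f}x")
```

Swapping in your own token counts shows how the ratio moves with the input/output mix: the more output-heavy the run, the wider the gap, because the quoted output rates differ by more than the input rates.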

What "near Opus" actually means in practice

The creator's claim is "near Opus level intelligence for programming," and the honest reading is "near, not Opus." The benchmark scores give M2.5 a 76.8% on BFCL and a multi-SWE-bench win over both Opus 4.6 and Sonnet 4.6 — so for the executor slot, M2.7 is at or above the consumer-tier Claude models. For raw knowledge Q&A, raw reasoning, and the kind of multi-step planning Opus is sold for, M2.7 is the wrong tool. Keep Claude or GPT for those jobs, route M2.7 for the rest.

The "near, not Opus" framing is important because it tells you exactly what to expect from the migration. If your workflow is dominated by code generation, file edits, test runs, and agent loops, M2.7 is at or above the consumer-tier Claude models on the relevant benchmarks. If your workflow is dominated by planning, architecture, raw knowledge Q&A, or multi-step reasoning, M2.7 is not Opus and you should keep Claude or GPT for that slot. The routing rule is precise about which slot is which.

The dumb-zone failure mode (preview)

This video doesn't focus on the "dumb zone," but the failure mode is worth previewing because it shows up in every long-running Minimax agent. Once soul.md swells past 15–30 lines, the model enters a "dumb zone" where it "starts messaging your girlfriend instead of building a presentation." The fix is context hygiene, not a model upgrade. The §4.2 article covers the workarounds in detail.

The dumb-zone failure is a model-side issue, not a harness issue. The same Minimax 2.7 model performs well on short, well-curated context and poorly on long, polluted context. Claude, GPT, and Kimi are more forgiving of context bloat. If you migrate from Claude to Minimax for the executor slot, you also have to migrate your context hygiene — the same soul.md that worked at 50 lines on Claude may degrade M2.7. The fix is to keep soul.md short (15–30 lines), trim agents.md to the minimum, and reinstall the agent directory from scratch when the pollution gets bad.
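The soul.md cap is easy to enforce mechanically before an agent run. The sketch below assumes soul.md sits in the current working directory and that only non-blank lines count toward the 15–30 line budget; both are assumptions made for illustration, not rules stated by the channel.

```python
from pathlib import Path

MAX_SOUL_LINES = 30  # upper end of the 15-30 line range quoted above


def count_content_lines(text: str) -> int:
    """Count non-blank lines (assumption: blank lines do not add context load)."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_soul(path: Path, limit: int = MAX_SOUL_LINES) -> tuple[int, bool]:
    """Return (line_count, within_limit) for the file at `path`."""
    count = count_content_lines(path.read_text(encoding="utf-8"))
    return count, count <= limit


if __name__ == "__main__":
    soul = Path("soul.md")  # assumed location: current working directory
    if soul.exists():
        lines, ok = check_soul(soul)
        status = "OK" if ok else "too long, trim before the next run"
        print(f"{soul}: {lines} non-blank lines ({status})")
    else:
        print(f"{soul} not found")
```

Running a check like this at the start of each long-running session turns the context-hygiene advice into a gate rather than a habit.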

The "Anthropic pulled a fast one" framing

Two days after the M2.7 review went live, the channel published the 24K-view plan-throttling controversy video, and the timing is the point. The "you don't need Opus" thesis was already in the channel's coverage; the plan-throttling saga gave the channel a public, citable reason to recommend the switch at scale. The data points in that video (the 5-hour rolling window being throttled on the consumer tier, the "feature, not a bug" tweet from Anthropic, the channel's Mythos theory) are covered in §4.2.

The "Anthropic admits fault" video three days later, with 9,673 views, softens the tone but doesn't move the verdict. The channel reads Anthropic's "people are hitting our usage limits in Claude code way faster than expected" statement as an admission of guilt — Anthropic overcorrected, didn't ship a bug, and is now backtracking. The limits stay opaque; the percentage burn meter stays the only feedback. Until Anthropic ships real request counts (the way MiniMax does), switching back to GPT stays on the table.

The two videos together (24K + 9.7K views, 33.7K combined) are the most-watched Claude coverage on the channel after the M2.7 review, and they are the videos that gave the channel a public mandate to recommend the routing switch. The migration is not just a model-quality argument; it is a vendor-behaviour argument. The channel's read is that Anthropic is rationing compute for the consumer tier to fund a parallel training pipeline, and the only way to opt out of the rationing is to migrate to a vendor that does not ration. The §4.4 anchor (40% vs 63%) is the model-quality argument; the §4.2 anchor (the "feature, not a bug" tweet) is the vendor-behaviour argument. Both are load-bearing.

The full "route around Opus" playbook

The routing rule the channel uses across the Claude coverage is consistent enough to be reproduced from memory:

Slot	Model	Why
Orchestrator	Claude Opus 4.6 (or GPT 5.4)	Deep reasoning, terminal ops, architecture
Research / specialist agents	Minimax 2.7	1/16th cost, agent-trained, fast on long loops
Planning	GPT 5.4 or Notebook LM	Better than Claude Code "from scratch"
Implementation	Minimax 2.7	2–3 min vs 10–15 min for one-shot builds
Loop-syntax coding	Claude Fable 5	Only case where Claude is "too strong" (§4.5)

The orchestrator is the slot where Opus is least easy to replace, which is why the channel keeps it on the bill. Everything below the orchestrator is where the routing switch pays for itself, and every Claude video after this one is some variation of "yes, but here is the exception, and the exception closes June 21–22."
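The routing table can be written down as a small lookup so the rule lives in one place. This is a sketch only: the slot keys and the `pick_model` helper are hypothetical names, and the model strings are display labels copied from the table, not API model identifiers.

```python
# Slot -> models in preference order, copied from the routing table above.
ROUTING = {
    "orchestrator": ["Claude Opus 4.6", "GPT 5.4"],
    "research": ["Minimax 2.7"],
    "planning": ["GPT 5.4", "Notebook LM"],
    "implementation": ["Minimax 2.7"],
    "loop_syntax_coding": ["Claude Fable 5"],
}


def pick_model(slot: str, unavailable: frozenset[str] = frozenset()) -> str:
    """Return the first listed model for `slot` that is not marked unavailable."""
    try:
        candidates = ROUTING[slot]
    except KeyError:
        raise ValueError(f"unknown slot: {slot!r}") from None
    for model in candidates:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for slot {slot!r}")
```

For example, `pick_model("orchestrator", frozenset({"Claude Opus 4.6"}))` falls through to the second entry, which mirrors the "Opus, or GPT 5.4" fallback in the table. Keeping the rule as data also makes it cheap to revise when the channel's picks change.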

Try it yourself

This subtopic is conceptual, so the "try it yourself" is a routing experiment, not a coding one.

Log your last week's Claude usage by task class. Split it into orchestrator (planning, architecture, deep reasoning) and executor (file edits, test runs, code generation, agent loops). Most users will see the executor share dominate — that's the bucket to migrate.
Re-run the same executor task on Minimax 2.7. Use the Claude Code + Minimax 2.7 video for the env-var swap. Time both runs.
Test the high-speed M2.7 variant if you have a $40/month tier. The high-speed variant is gated; the plus plan uses the standard M2.7. The channel's claim is that the high-speed variant makes the routing switch feel free; the plus plan is already cheap but slower on long contexts.
Run the Boxmining benchmark on your own Opus workload. The test covers four categories: instruction following, opposite behavior, false completion, and destructive actions. If you land below 50% on a representative task, you have empirical permission to route around Claude for that workload. (Full benchmark details in §4.4.)
Decide your orchestrator slot. Is Opus 4.6 still the best option for the planning role, or has GPT 5.4 already taken it? The WildClaw benchmark is the open-source way to test this against your own tasks.
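The first step, splitting a week of usage into orchestrator and executor shares, can be sketched as follows. The task classes and their slot assignments come from the list above; the log format of (task_class, tokens) pairs and the sample numbers are made up for illustration.

```python
from collections import Counter

# Task class -> slot, following the split described in step one above.
TASK_CLASS_TO_SLOT = {
    "planning": "orchestrator",
    "architecture": "orchestrator",
    "deep reasoning": "orchestrator",
    "file edits": "executor",
    "test runs": "executor",
    "code generation": "executor",
    "agent loops": "executor",
}


def slot_shares(log: list[tuple[str, int]]) -> dict[str, float]:
    """Return each slot's share of total tokens from (task_class, tokens) entries."""
    totals: Counter = Counter()
    for task_class, tokens in log:
        totals[TASK_CLASS_TO_SLOT[task_class]] += tokens
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {slot: used / grand_total for slot, used in totals.items()}


# Hypothetical week of usage; replace with your own log.
week = [("planning", 200_000), ("file edits", 500_000), ("agent loops", 300_000)]
shares = slot_shares(week)
for slot, share in sorted(shares.items()):
    print(f"{slot}: {share:.0%}")
```

If the executor share dominates, as the step above predicts for most users, that bucket is the candidate for migration.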

Common pitfalls

Reading "near Opus" as "is Opus." Minimax 2.7 is near-Opus for the executor slot, not for the orchestrator slot. Don't migrate the planning role without a separate benchmark.
Paying Opus prices for executor work. The headline cost ratio ($5 vs $0.30 per million input tokens) is enough to disqualify Opus from executor work on price alone, before the 40% benchmark result enters the picture.
Trusting a single benchmark. M2.5 won multi-SWE-bench and BFCL multi-turn, but those are agentic coding benchmarks, not knowledge Q&A. Use the right benchmark for the slot.
Letting soul.md blow up past 15–30 lines. The dumb-zone failure hits Minimax faster than Claude. Cap it.
Treating the routing switch as a one-time event. The channel's routing rule changed twice in 2026 alone (Fable 5 was added in §4.5, then the cheap window closed). Treat the routing rule as a living document.
Paying for the high-speed Minimax M2.7 on every workflow. The high-speed variant is gated to $40/month and is only worth the bump on long-context tasks. Most agent work runs fine on the plus plan.
Paying pay-as-you-go pricing for an overnight run. The 5-hour rolling limit on the token plan is the right shape for batched work. Pay-as-you-go on the same model will spike costs without giving you more capability.
Pointing at minimax.com instead of api.minimax.io. The China endpoint is slow for non-Chinese users. This bites international users the first time they try the swap.
Treating M2.7 and M3 as interchangeable. M3 is meaningfully better on long-context work (MSA, 1M context, fewer errors). M2.7 is the model Kilo Code's token plan still shows. Pick by task.
Migrating the orchestrator without a separate benchmark. M2.7 is not Opus for the planning role. The orchestrator slot needs its own benchmark — WildClaw or the Boxmining benchmark from §4.4.
Trusting the channel's "near Opus" claim for raw knowledge Q&A. The M2.5 BFCL win is a coding win, not a knowledge win. M2.7 is explicitly an "agentic coding model," not a general chatbot. Keep Claude or GPT for raw knowledge work.
Repricing Minimax without re-checking the M3 release. M3 is positioned to push the BFCL score further and ships 1M context. The plan structure (Plus, Max, the $40 high-speed tier) may shift when M3 lands on the token plan. Re-check before renewing.
Reading the 2–3 minute one-shot presentation time as a universal win. M2.7's speed advantage is on the executor slot. For long-horizon planning, the speed advantage disappears. Match the model to the task.

A worked example: the boxminingai.com rebuild

The channel's most-cited worked example for the routing rule is the boxminingai.com site rebuild. The build was scheduled on a VPS that "runs 24 hours for me," and the creator woke up to a finished build. The routing rule he used for that build:

The planning step ran on GPT 5.4. The brief was "build a Windows-95-style video browser" with three features: a content grid, a video player, and a feedback form. The planning step produced a multi-phase implementation plan.
The implementation step ran on Minimax 2.7 via the Claude Code env-var swap. The agent loop ran overnight, with M2.7 executing the plan, generating files, and self-correcting on errors.
The review step ran on Opus 4.6. The morning-after review of the diff used Opus to catch security-sensitive or money-handling code that M2.7 might have missed.
The fallback was Notebook LM for any planning step that GPT 5.4 produced unsatisfying output on.

The cost on that build was roughly $5 in Minimax tokens (the implementation step) + $2 in GPT 5.4 tokens (the planning step) + $1 in Opus tokens (the review step) = $8 total. The same build on Opus 4.6 end-to-end would have cost roughly $90 (the API spend calculation from earlier in this article). The 1/11th cost ratio on a real production build is the empirical anchor for the routing rule.

The build also surfaced a real failure mode: M2.7 produced a fully working build, but the security review on Opus flagged a hardcoded API key in the feedback form. M2.7 had used a placeholder for development, and the agent loop had not caught it. The Opus review step is what kept the key out of production. That is the load-bearing reason to keep Sonnet or Opus in reserve for the final review pass, even when the executor slot is on M2.7.
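A cheap pre-review scan can catch the most obvious form of this failure before the Opus pass. The sketch below is an assumption-heavy illustration, not the channel's review process: the regular expression is a hypothetical heuristic that flags a key-like identifier assigned a quoted literal of 16 or more characters, and it will miss many real secrets and flag some harmless strings.

```python
import re

# Hypothetical heuristic: key-like name, then ':' or '=', then a long quoted literal.
SECRET_PATTERN = re.compile(
    r"""(?ix)
    \b(api[_-]?key|secret|token)\b   # key-like identifier
    \s*[:=]\s*                       # assignment or mapping separator
    ["']([A-Za-z0-9_\-]{16,})["']    # quoted literal value, 16+ characters
    """
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, identifier) for each suspicious assignment."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

A scan like this complements the review step rather than replacing it: it only sees literal strings on a single line, whereas a reviewing model can judge whether a value is a development placeholder or a live credential.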

The four-slot routing rule, in detail

The four-slot routing rule from §4.1 is the load-bearing decision the rest of the course is built on. The rule, in detail:

Slot 1: Orchestrator (the brain)

The orchestrator plans multi-step work, holds state across many turns, and decides which executor to call and when. The channel's pick: Claude Opus 4.6 (or GPT 5.4 for migration). The orchestrator is the slot where Opus is hardest to replace, which is why the channel keeps it on the bill.

The orchestrator is the slot where the model-choice decision is most expensive to get wrong. A bad orchestrator wastes every downstream executor call. A good orchestrator plans the work, picks the right executor, and audits the output. The channel's read is that Opus 4.6 is still the best pick for this slot, but GPT 5.4 is closing fast and is a credible backup.

The orchestrator's job is to:

Plan multi-step work — break a brief into phases, assign each phase to an executor, and audit the output.
Hold state across many turns — remember what the executor did, what the user said, and what the next step is.
Decide which executor to call — match the task to the model's strength. Coding work goes to Fable 5; executor work goes to M2.7; orchestrator work stays on Opus.
Audit the output — catch the failures the executor missed. The hardcoded API key in the boxminingai.com rebuild is the canonical example.

The orchestrator is the slot where the §4.4 Boxmining benchmark matters most. A 40% orchestrator produces broken plans; a 75% orchestrator produces plans that work.
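The 50% threshold from the "Try it yourself" section can be turned into a simple decision helper. This sketch assumes every test case counts equally and that scores are pooled across the four categories; how the Boxmining benchmark actually weights its categories is not described here, and the sample results are invented.

```python
CATEGORIES = (
    "instruction following",
    "opposite behavior",
    "false completion",
    "destructive actions",
)
ROUTE_AROUND_THRESHOLD = 0.50  # "below 50%" rule quoted earlier in this article


def pass_rate(results: dict[str, list[bool]]) -> float:
    """Fraction of passed cases pooled across all four categories."""
    outcomes = [ok for category in CATEGORIES for ok in results.get(category, [])]
    if not outcomes:
        raise ValueError("no test cases recorded")
    return sum(outcomes) / len(outcomes)


def should_route_around(results: dict[str, list[bool]]) -> bool:
    """True when the pooled pass rate falls below the threshold."""
    return pass_rate(results) < ROUTE_AROUND_THRESHOLD


# Invented results for illustration: 8 passes out of 20 cases.
sample_results = {
    "instruction following": [True, True, False, False, False],
    "opposite behavior": [True, False, False, False, False],
    "false completion": [True, True, False, False, False],
    "destructive actions": [True, True, True, False, False],
}
print(f"pass rate: {pass_rate(sample_results):.0%}")
print(f"route around: {should_route_around(sample_results)}")
```

Pooling is the simplest choice; if one category matters more for your workload (destructive actions, for instance), a per-category threshold may be the safer variant.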

Slot 2: Executor (the hands)

The executor reliably calls tools, follows formatting instructions, and doesn't get clever. The channel's pick: Minimax 2.7 (or GLM 5.1 for higher-budget workflows). The executor is the slot where Opus is most overpriced, which is why the channel routes around it.

The executor's job is to:

Call tools reliably — the right tool, the right arguments, the right order.
Follow formatting instructions — tabs not spaces, function order, function length, error handling.
Not get clever — don't add features that weren't asked for, don't refactor code that wasn't requested, don't make architectural decisions that the orchestrator should be making.

The executor is the slot where the 1/16th cost ratio from §4.1 matters most. A $5/M-token model costs roughly 16 to 17 times as much as a $0.30/M-token model. For executor work, the quality difference is small; the cost difference is large.

Slot 3: Loop-syntax coding (the harness)

The loop-syntax coding slot is the new slot the channel adds in §4.5. The pick: Claude Fable 5 (or Opus 4.8 after the cheap window closes). The slot is the only one where the channel rates a Claude variant "too strong."

The loop-syntax coding slot's job is to:

Run 6- and 11-point QC checks — the model takes Playwright screenshots, runs the validation rule, and self-corrects.
Pass 11/11 self-QC checks — the load-bearing example in §4.5 is the physics game, where Fable 5 + loops passes 11/11.
Interpret briefs as 3D-property changes — the age-erosion example in §4.5 is the structural difference between Fable 5 and Opus 4.8.

The loop-syntax coding slot is the only slot where the channel's "you don't need Opus" thesis does not apply. Fable 5 + loops is genuinely the best option for this slot, and the close of the cheap window on June 21–22 is the calendar event that determines whether to keep the model on the bill.

Slot 4: One-shot builds (the executor)

The one-shot builds slot is the slot the channel covered in the M2.7 review. The pick: Minimax M2.7 (or GLM 5.1 for higher-budget workflows). The slot is the same as the executor slot, but the workload is a single prompt, not an agent loop.

The one-shot builds slot's job is to:

Generate a working app from a single brief — the 2–3 minute one-shot presentation in the M2.7 review is the load-bearing example.
Match the brief's success criteria — without an agent loop, the model has one shot to get it right.
Stay cheap — one-shot builds are the slot where the 1/16th cost ratio matters most. A failed one-shot build is a wasted call; a successful one-shot build on a cheap model is the cheapest possible output.

The four slots together are the routing rule. The orchestrator stays on Opus; the executor and one-shot builds go to Minimax M2.7; the loop-syntax coding slot goes to Fable 5. The rule is precise about which slot is which, and the migration playbook in the capstone is built on the rule.

Sources

Minimax M2.7 is INSANELY GOOD! (Full Review) — 31,049 views · video_id: --uxieT5J9Y · the load-bearing video for the "you don't need Opus" thesis.
Anthropic pulled a fast one on us! (Opus plans LIMITED) — 24,059 views · video_id: MkabEkgGpjA · the plan-throttling video that gave the thesis a public anchor.
Anthropic admits fault (Claude limits to be INCREASED) — 9,673 views · video_id: WiAx9sPw69U · the three-days-later partial walk-back.
Supabase query — SELECT video_id, title, views, summary_content, summary_key_takeaways FROM public.videos WHERE video_id = ANY(ARRAY['--uxieT5J9Y','MkabEkgGpjA','WiAx9sPw69U']); against project ttxdssgydwyurwwnjogq.