The capstone of this course is the decision rule for picking an auxiliary over a flagship, a free tier over a paid tier, and a high-volume "dumb" model over a smart-but-slow one. The argument is simple: in any agent workload, the right model is the one that passes your tasks for the lowest cost, not the one with the highest benchmark score. The channel's coverage of Mimo V2 Pro, the BYOK free-tier pattern, and the auxiliary slot in the tier list all point to the same conclusion — the "best" model is rarely the right pick for the executor slot, and "good enough" is the structural answer for the auxiliary slot.
This article walks through the cost / quality trade-off, the workflow class rule (which workloads should be on a free model and which shouldn't), the Mavis verifier pattern as the multiplier, and the migration story (what to do when the free tier ends and you have to pay). The article closes with the channel's overall recommendation: a $0/month agent stack is not a compromise — it's the right answer for most workloads.
What you'll learn
- The "good enough" decision rule: pick the model that passes your tasks for the lowest cost, not the one with the highest benchmark score.
- The cost / quality trade-off is not linear. A model with 55% reliability at $0/run can be more valuable than a model with 75% reliability at $50/run if the workload is high-volume and the misses are recoverable.
- The workflow class rule: orchestrator work needs a smart model (GPT 5.4 or Opus, paid); executor work can use a cheap model (Minimax M2.7, Mimo V2 Pro, often free); auxiliary work should use a free model (Gemini 2.5 Flash, Gemini 3 Flash, Mimo V2 Pro).
- The Mavis verifier pattern multiplies effective reliability. Workers on a free 55% model, verifier on a different free model, output only ships if both pass. Effective reliability is higher than any single model.
- The prompt caching mechanic is the multiplier on top. Cached reads are 5–10x cheaper than uncached input, so a long-running agent on a free model with caching is sustainable.
- The migration story: when the free period ends, you have a backup model. The transition is smoother if you've already tested the backup. Don't wait for the free period to end to discover the fallback doesn't work.
The cost / quality trade-off
The "good enough" decision rule starts with a counter-intuitive observation: a model with 55% reliability at $0/run can be more valuable than a model with 75% reliability at $50/run for the right workload. The math is straightforward:
- Model A: 55% reliability, $0/run, 500 min wall-clock. Cost per successful run: $0.
- Model B: 75% reliability, $50/run, 500 min wall-clock. Cost per successful run: $50 / 0.75 = $67.
If the workload is "process 10,000 documents," the cost comparison is:
- Model A: $0 total, 5,500 correct (55%). 4,500 need review.
- Model B: $500,000 total, 7,500 correct (75%). 2,500 need review.
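At 10,000 documents, Model A is $500,000 cheaper and produces 2,000 fewer correct outputs. The price of that gap is the review cost of the 2,000 extra misses: Model A sends 4,500 outputs to review, Model B sends 2,500. Unless reviewing one output costs more than $250 ($500,000 / 2,000), Model A still comes out ahead.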
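The comparison above can be reproduced with a short sketch. It assumes, as the example does, that each document is one run; the names `Model`, `workload`, and `break_even_review_cost` are illustrative and not taken from the channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    reliability: float   # fraction of runs that succeed on the first pass
    cost_per_run: float  # dollars per run


def cost_per_success(m: Model) -> float:
    """Expected dollars spent per successful run."""
    return m.cost_per_run / m.reliability


def workload(m: Model, documents: int) -> tuple[float, int, int]:
    """Return (total cost, correct outputs, outputs needing review).

    Assumes one run per document, as in the example above.
    """
    correct = round(documents * m.reliability)
    return m.cost_per_run * documents, correct, documents - correct


model_a = Model("A", reliability=0.55, cost_per_run=0.0)
model_b = Model("B", reliability=0.75, cost_per_run=50.0)

cost_a, correct_a, review_a = workload(model_a, 10_000)
cost_b, correct_b, review_b = workload(model_b, 10_000)

# Extra items Model A sends to review, and the per-item review cost at
# which paying for Model B would start to break even.
extra_reviews = review_a - review_b
break_even_review_cost = (cost_b - cost_a) / extra_reviews

print(f"A: ${cost_a:,.0f}, {correct_a} correct, {review_a} to review")
print(f"B: ${cost_b:,.0f}, {correct_b} correct, {review_b} to review")
print(f"Break-even review cost: ${break_even_review_cost:,.2f} per item")
```

The break-even figure is the useful output: it turns the question of whether the misses are recoverable into a per-item review price you can compare against your own process.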
The math flips when the workload is "decide whether to approve a loan." A 45% miss rate on loan approvals is unacceptable. A 25% miss rate is also unacceptable, but because the absolute cost of a single bad decision is high, every point of reliability is worth paying for. For high-stakes one-shot decisions, the higher-reliability model is the right pick. For high-volume recoverable workloads, the lower-cost model is the right pick.
The channel's coverage of the cost / quality trade-off in the Best Model for Openclaw (WildClaw Benchmarks!) video is the most explicit:
- Claude Opus 4.7: 51% success rate, $80/run, $157 per successful run
- GPT 5.4: ~65% success rate, ~$20/run, ~$31 per successful run
- Mimo V2 Pro: ~55% success rate, $26/run (when paid), $47 per successful run
- Minimax 2.7: ~45% success rate, ~$8/run, ~$18 per successful run
- Grok: ~40% success rate, ~$15/run, ~$38 per successful run (but 94 min vs ~500 min)
The bottom line: Minimax M2.7 has the best cost-per-success ratio (~$18), followed by GPT 5.4 (~$31) and Grok (~$38). Mimo V2 Pro is fourth at its paid price ($47) and only leads while it is free. Opus 4.7 is the worst, at more than three times the next-worst figure. The channel's recommendation: "use Opus only if you have an uncapped coding plan. Otherwise run WildClaw yourself against GPT 5.4, Mimo V2, or Grok and pick the one that passes your own tasks for the lowest $/run."
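As a rough check on the figures quoted above, the following sketch divides each model's cost per run by its success rate and sorts the result. The dictionary name `WILDCLAW` and the function name are illustrative, and the "~" values are treated as exact for the arithmetic.

```python
# (success rate, dollars per run) as quoted above; "~" values are approximate.
WILDCLAW = {
    "Claude Opus 4.7": (0.51, 80.0),
    "GPT 5.4": (0.65, 20.0),
    "Mimo V2 Pro (paid)": (0.55, 26.0),
    "Minimax M2.7": (0.45, 8.0),
    "Grok": (0.40, 15.0),
}


def cost_per_success(success_rate: float, cost_per_run: float) -> float:
    """Expected dollars per successful run."""
    return cost_per_run / success_rate


# Cheapest cost-per-success first.
ranking = sorted(
    ((name, cost_per_success(rate, cost)) for name, (rate, cost) in WILDCLAW.items()),
    key=lambda item: item[1],
)

for name, dollars in ranking:
    print(f"{name:<20} ${dollars:7.2f} per successful run")
```

Sorting by this ratio, rather than by success rate alone, is what moves Minimax M2.7 to the top of the list and Opus 4.7 to the bottom.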
Three worked examples
The cost / quality trade-off lands differently depending on the workload. Three worked examples from the channel's coverage:
Worked example 1: Daily news report (the channel's own workflow).
The creator runs a daily news report that scrapes 14 sources, filters duplicates, ranks by importance, writes a structured markdown file, and posts to Discord. Old setup: 14 sequential web searches, single sub-agent, 10–15 minutes. New setup with Mavis / Hermes: 5 parallel search workers, 2 editors, 2 publishers — 9 workers in parallel, 3–4 minutes.
The model choice: Opus 4.6 for the planning step (orchestrator), Minimax M2.7 for the execution step (executor). The full breakdown:
- Orchestrator (Opus 4.6, 1 call): $0.05 per day (input/output combined)
- Executor workers (Minimax M2.7, 9 calls): $0.01 per day
- Verifier (Mavis, separate agent, 1 call): $0.001 per day
- Total: $0.06 per day, $1.80 per month
If the same workflow ran on Opus 4.6 for everything: 10 calls × $0.05 = $0.50 per day, $15 per month. The Minimax + Mavis pattern is 8x cheaper. The channel's framing: "this is the cheap model the orchestrator shouldn't be paying for — the worker is the right tool."
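The daily-report numbers can be tallied in a few lines. This is a minimal sketch that assumes a 30-day month (which matches the $1.80 figure) and prices every all-Opus call like the orchestrator call; the variable names are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: a 30-day month, matching the $1.80 figure

# Daily cost per role in dollars, as quoted above.
mixed_stack = {
    "orchestrator (Opus 4.6, 1 call)": 0.05,
    "executors (Minimax M2.7, 9 calls)": 0.01,
    "verifier (Mavis, 1 call)": 0.001,
}

mixed_daily = sum(mixed_stack.values())
all_opus_daily = 10 * 0.05  # 10 calls priced like the orchestrator call
ratio = all_opus_daily / mixed_daily

print(f"Mixed stack: ${mixed_daily:.3f}/day, ${mixed_daily * DAYS_PER_MONTH:.2f}/month")
print(f"All Opus:    ${all_opus_daily:.2f}/day, ${all_opus_daily * DAYS_PER_MONTH:.2f}/month")
print(f"Ratio:       {ratio:.1f}x")
```

The unrounded daily total is $0.061, so the monthly figure and the ratio differ slightly from the rounded values in the text, but the conclusion of roughly 8x is unchanged.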
Worked example 2: High-volume document processing.
The channel's hypothetical: "process 10,000 PDFs and extract a structured field from each." The cost breakdown across the five models:
- Mimo V2 Pro (free during promo): $0 total, 5,500 correct, 4,500 need review.
- Mimo V2 Pro (paid, $26/run): $26 × 100 batches of 100 = $2,600 total, 5,500 correct.
- Minimax M2.7 ($8/run): $800 total, 4,500 correct, 5,500 need review.
- GPT 5.4 ($20/run): $2,000 total, 6,500 correct, 3,500 need review.
- Opus 4.7 ($80/run): $8,000 total, 5,100 correct, 4,900 need review.
The cost-per-correct field:
- Mimo (free): $0 / 5,500 = $0 per correct field
- Mimo (paid): $2,600 / 5,500 = $0.47 per correct field
- Minimax: $800 / 4,500 = $0.18 per correct field
- GPT 5.4: $2,000 / 6,500 = $0.31 per correct field
- Opus 4.7: $8,000 / 5,100 = $1.57 per correct field
Among the paid options, Minimax M2.7 wins on cost-per-correct, even though it has the lowest absolute reliability. Mimo (free) is unbeatable during the promo. Opus 4.7 is the worst, at more than three times the cost of the next-worst option (paid Mimo).
Worked example 3: A coding task that needs 100% reliability.
The channel's framing from the coding videos: "Minimax 2.7 is 'near Opus, not Opus' — keep Sonnet or Opus reserved for the final review pass on security-sensitive or money-handling code." For a one-shot coding task that needs 100% reliability (e.g. a payment-handling function, an auth check, a database migration):
- Worker: Minimax M2.7 or Mimo V2 Pro (free) for the first pass.
- Reviewer: Opus 4.6 or Sonnet 4.6 for the final review.
The two-pass pattern catches what one pass misses. The cost: $0.05 for the worker + $0.10 for the reviewer = $0.15 per task. The reliability: near 100% (the worker's misses are caught by the reviewer). The trade-off: you pay for two models, but the absolute cost is small and the reliability is high.
The channel's overall recommendation for production-critical work: "always diff the overnight build before merging" and "keep Sonnet or Opus in reserve for the final review pass." The "good enough" rule is for the executor and auxiliary slots, not for production-critical code.
The workflow class rule
The "good enough" decision rule is operationalised by the workflow class rule: match the model to the slot. The three slots in the channel's tier list, and the right model for each:
- Orchestrator (the brain): needs a smart model that can plan multi-step work and hold state across many turns. Right pick: GPT 5.4 ($50–75/mo) or Opus (when it's working). Don't use a free model here — the orchestrator's planning errors compound across the entire workflow.
- Executor (the hands): needs a reliable model that calls tools and follows formatting instructions. Right pick: Minimax M2.7 ($10–20/mo), Mimo V2 Pro (free during promo), Z.AI GLM 5.1 ($72/mo for code-heavy). The cost-per-success math dominates this slot.
- Auxiliary (support): needs a fast, free model for narrow, high-volume, recoverable tasks. Right pick: Gemini 2.5 Flash (default in Hermes), Gemini 3 Flash (web search), Mimo V2 Pro (high-volume), or any open-weight model you can self-host.
The workflow class rule's structural argument: each slot has a different sensitivity to errors, and the right model is the one that passes the slot's tasks for the lowest cost. The orchestrator is the most sensitive to errors, even though it has the lowest call volume. The auxiliary is the least sensitive to errors, even though it has the highest call volume.
The channel's coverage in the Top AI Models for Hermes Agent (Tier List) video is explicit:
- "Hot-swap Claude out for the orchestrator and let Mimo V2 Pro handle execution." (April 2026 verdict, after the Opus 4.7 regression.)
- "Use MiniMax M2.7 for iterative coding, multi-file refactors, long agentic loops, and Go/Rust/TypeScript/Java work." (Executor slot.)
- "Use Gemini 2.5 Flash for chat-adjacent tasks, Gemini 3 Flash for web search, Mimo V2 Pro for high-volume document processing." (Auxiliary slot.)
The Mavis verifier pattern as a multiplier
The Mavis verifier pattern (see Course 2 §2.3) is the multiplier on top of the cost / quality trade-off. The pattern: workers produce the work on a free 55% model, verifier reviews on a different free model without shared conversation history, output only ships if both pass. Effective reliability is higher than any single model in the auxiliary slot.
The math:
- Worker on Mimo V2 Pro: 55% first-pass reliability.
- Verifier on Gemini 3 Flash: 90% catch rate on the 45% worker failures.
- Effective reliability: 0.55 + 0.45 × 0.90 = 95.5%.
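A 95.5% effective reliability is competitive with paid flagship models. The figure assumes that every failure the verifier catches is retried or repaired successfully and that the verifier never rejects a correct output; the remaining 4.5% (0.45 × 0.10) are failures the verifier misses. The cost is $0, because both models are free. The total runtime is higher (worker + verifier), but for a non-time-critical auxiliary workload, the trade-off is worth it.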
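The verifier arithmetic can be written as two small functions. This is a sketch under stated assumptions, not the Mavis implementation: the first function mirrors the formula above and assumes caught failures are retried or repaired successfully; the second shows what happens if caught failures are simply dropped. Both assume the verifier never rejects a correct output, and both function names are illustrative.

```python
def effective_reliability(worker_pass: float, verifier_catch: float) -> float:
    """Share of tasks that end up correct under the formula above.

    Assumes every failure the verifier catches is then retried or repaired
    successfully, and that the verifier never rejects a correct output.
    """
    return worker_pass + (1.0 - worker_pass) * verifier_catch


def shipped_precision(worker_pass: float, verifier_catch: float) -> float:
    """Share of shipped outputs that are correct when caught failures are
    dropped instead of retried (same no-false-rejection assumption)."""
    slipped = (1.0 - worker_pass) * (1.0 - verifier_catch)
    return worker_pass / (worker_pass + slipped)


print(f"{effective_reliability(0.55, 0.90):.3f}")
print(f"{shipped_precision(0.55, 0.90):.3f}")
```

The gap between the two results is a reminder that the headline figure depends on what the pipeline does with rejected outputs, so it is worth measuring the verifier's catch rate and false-rejection rate on your own tasks.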
The channel's coverage of the verifier pattern in the Minimax Mavis: The BEST Multi-Agent Platform for Beginners video is the most explicit:
- "The way it works too with the Mavis verifier is it doesn't have shared conversation history. So, it's not messy. Every question you ask it, it's completely unbiased."
- "The best performance comes at the early context… having new agents or having a team member of teams actually makes sense here."
The structural argument: the verifier is blind to the worker's history, so the second opinion is genuinely independent. The worker can't bias the verifier by quoting its own output. The result is a higher effective reliability than any single model can provide.
Prompt caching as a multiplier
The prompt caching mechanic is the multiplier on top of the verifier pattern. Cached reads are 5–10x cheaper than uncached input, so a long-running agent on a free model with caching is sustainable. The channel's coverage in the DeepSeek v4 Flash + Hermes Agent = Surprisingly STRONG video is the most explicit:
- "V4 Flash's prompt caching is the explicit reason it became the highest-consumed token on Hermes Agent."
- "If you find yourself re-pasting the same context across runs, switch to V4 Flash and let the cache do the work."
The structural argument: a long-running agent sends the same system prompt on every turn. Without caching, every turn pays full price for that prompt, say 5,000 input tokens. With caching at the 5–10x discount, the cached portion is billed like roughly 500–1,000 uncached tokens per turn, plus the delta. The savings compound over thousands of turns.
For a free-tier stack, prompt caching is what makes the $0/month promise sustainable. Without caching, a 24/7 agent on Mimo V2 Pro would burn through the free quota in hours. With caching, the same agent runs for weeks on the same free quota.
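To make the compounding concrete, here is a minimal sketch of billed input over many turns. It is not any provider's billing formula: the function name, the 100-token per-turn delta, and the 1,000-turn horizon are assumptions, and cache-write surcharges and cache expiry are ignored.

```python
def billed_input_tokens(
    system_prompt_tokens: int,
    delta_tokens_per_turn: int,
    turns: int,
    cache_discount: float,
) -> float:
    """Input tokens billed, expressed in uncached-token equivalents.

    cache_discount is how many times cheaper a cached read is than uncached
    input (1.0 means no caching). The first turn is billed in full because
    nothing is cached yet.
    """
    if turns <= 0:
        return 0.0
    first_turn = system_prompt_tokens + delta_tokens_per_turn
    later_turn = system_prompt_tokens / cache_discount + delta_tokens_per_turn
    return first_turn + (turns - 1) * later_turn


TURNS = 1_000  # hypothetical horizon
no_cache = billed_input_tokens(5_000, 100, TURNS, cache_discount=1.0)
cache_5x = billed_input_tokens(5_000, 100, TURNS, cache_discount=5.0)
cache_10x = billed_input_tokens(5_000, 100, TURNS, cache_discount=10.0)

print(f"no cache:  {no_cache:,.0f}")
print(f"5x cache:  {cache_5x:,.0f}")
print(f"10x cache: {cache_10x:,.0f}")
```

Because the per-turn delta is never cached, the overall saving is smaller than the headline discount; the longer the system prompt relative to the delta, the closer the total gets to the full 5–10x.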
The migration story
The "good enough" decision rule has a built-in expiration date: the free period will end. The migration story is the plan for when it does.
The channel's coverage of the migration:
- Mimo V2 Pro: free period is promotional and will end eventually. Estimated paid pricing: $20–40/month.
- Minimax M2.7: $10–20/month, no promotional period.
- GPT 5.4: $50–75/month, no promotional period.
- Z.AI / GLM 5.1: $30–72/month (recently raised from $30 to $72, more than double).
The migration plan:
- Use the free period to build skills. Skills carry over to other models. The skill library is the asset.
- Test the backup model while the free period is still active. Don't wait to discover that your fallback doesn't work.
- Pick the backup based on workload class. For executor work, Minimax M2.7. For orchestrator work, GPT 5.4. For code-heavy, Z.AI GLM 5.1.
- Migrate gradually. Don't switch the entire agent on day one. Run both models in parallel for a week, compare results, then commit.
The channel's framing: "Use Mimo now, evaluate paid alternatives when free period ends. Know your backup model. Don't become dependent. Be ready when free period ends."
The "$0/month is not a compromise" argument
The capstone argument: a $0/month agent stack is not a compromise — it's the right answer for most workloads. The structural reason: most of an agent's workload is executor and auxiliary work, not orchestrator work. The orchestrator is the smart-model slot, but the executor and auxiliary slots are the high-volume slots, and the right tool for those slots is a free model with the verifier pattern.
The channel's overall framing across all the auxiliary coverage:
- Mimo V2 Pro: the high-volume king, most-used model on OpenRouter, free during the promo. Use it for high-volume document processing.
- Gemini 2.5 Flash: default in Hermes, chat-adjacent, fast, free. Use it for chat-adjacent tasks.
- Gemini 3 Flash: free Google Search grounding, URL reading. Use it for web search.
- Elephant Alpha and Trinity Large Preview: open-weight niche picks. Use them for self-hosting or privacy-critical workflows.
The $0/month stack is a real stack, with real production use cases, and a real migration plan. It's not a toy demo, and it's not a "wait for the free period to end" placeholder. It's the structural answer for the auxiliary slot, and the channel's coverage of it is the most useful framing in this course.
The model-selection cheat sheet
A consolidated view of the channel's recommended model picks by slot and workload class, distilled from §7.1–§7.4:
Orchestrator (the brain) — needs a smart model.
- Default: GPT 5.4 ($50–75/mo). The channel's "current king" after the Opus 4.6/4.7 regression.
- Multimodal alternative: Gemini 3.1 Pro (moderate cost). Native video and audio input, the go-to for screen recordings and structured dashboard extraction.
- Free alternative: Qwen 3.6 Plus (free on Hermes Agent). Always-on reasoning, preserved thinking across turns.
- Swarm alternative: Kimi 2.5 ($39–40/mo hosted, $2/mo self-host). Self-directs a swarm of ~100 sub-agents, coordinates up to 1,500 tool calls.
- Avoid: Claude Opus 4.6/4.7. Currently in the "question mark" tier.
Executor (the hands) — needs a reliable, cheap model.
- Default (free): Mimo V2 Pro ($0 during promo). The "high-volume king," best fit for executor work where the workload is large and the misses are recoverable.
- Paid alternative: Minimax M2.7 ($10–20/mo). Best cost-per-success on the WildClaw benchmark.
- Code-heavy alternative: Z.AI GLM 5.1 ($30–72/mo). 75%+ on coding tasks, self-corrects mid-execution.
- Speed-pick alternative: Grok (~$15/run). 94 min vs ~500 min for everything else.
- Open-weight alternative: Nemotron 3 Super (free, self-hosted). For privacy-critical coding agents.
Auxiliary (support) — needs a fast, free, narrow model.
- Default: Gemini 2.5 Flash (free, baked into Hermes). Chat-adjacent, fast, reliable.
- Web search: Gemini 3 Flash (free). Google Search grounding, URL reading.
- High-volume document processing: Mimo V2 Pro (free during promo). The "high-volume king."
- One-shot HTML: Mimo V2 Flash (free during promo). The only thing it's good at.
- Open-weight niche: Elephant Alpha (100B params, 256K context) or Trinity Large Preview (open-weight). For self-hosting or privacy-critical workflows.
- Reinforcement learning: Step 3.5 Flash (open-source, free). For self-improvement workflows.
Verifier (the multiplier) — needs an independent model.
- Default: A different auxiliary model from the worker. Workers on Mimo, verifier on Gemini 3 Flash. The two models form a free verifier pattern.
- Paid alternative: A separate paid model (GPT 5.4 mini, Minimax M2.7). The independent second opinion is the multiplier.
The cheat sheet's structural argument: the right model is the one that passes your tasks for the lowest cost, not the one with the highest benchmark score. The "good enough" decision rule is operationalised by matching the workload class to the slot, then picking the cheapest model in the slot that passes your tasks.
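The selection rule in the paragraph above can be sketched as a small function: filter the candidates by a minimum pass rate for the slot, then take the cheapest. The `Candidate` type, the `pick_model` name, and the pass-rate thresholds are hypothetical; the pass rates and prices reuse the WildClaw figures quoted earlier and should be replaced with measurements from your own tasks.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    pass_rate: float     # measured on your own tasks, 0.0 to 1.0
    cost_per_run: float  # dollars


def pick_model(candidates: list[Candidate], min_pass_rate: float) -> Optional[Candidate]:
    """Return the cheapest candidate that clears the slot's pass-rate bar.

    Ties on cost go to the higher pass rate. Returns None if nothing passes.
    """
    passing = [c for c in candidates if c.pass_rate >= min_pass_rate]
    if not passing:
        return None
    return min(passing, key=lambda c: (c.cost_per_run, -c.pass_rate))


# Illustrative executor candidates using the WildClaw figures quoted earlier.
executor_candidates = [
    Candidate("Mimo V2 Pro (free promo)", 0.55, 0.0),
    Candidate("Minimax M2.7", 0.45, 8.0),
    Candidate("GPT 5.4", 0.65, 20.0),
]

print(pick_model(executor_candidates, min_pass_rate=0.40))
print(pick_model(executor_candidates, min_pass_rate=0.60))
```

Raising the bar from 0.40 to 0.60 changes the answer from the free model to the paid one, which is the slot-dependence the cheat sheet describes: the threshold comes from the slot, the winner from the price.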
The "transition week" playbook
When the Mimo V2 Pro free period ends, the migration is smoother if you've already done the work. The channel's framing: "Know your backup model. Don't become dependent. Be ready when free period ends." The transition week playbook:
Day 1 (Monday): Announce the transition in the team. The free period is ending on Friday. Set the calendar reminder for the exact end date.
Day 2 (Tuesday): Test the backup model (Minimax M2.7) on the same high-volume workload. Run the WildClaw benchmark. Compare the cost / quality trade-off.
Day 3 (Wednesday): Wire the backup model in the Hermes config. Set the orchestrator to GPT 5.4, the executor to Minimax M2.7, the auxiliary to Gemini 3 Flash. Don't switch yet — just verify the wiring.
Day 4 (Thursday): Run both models in parallel. Split the workload 50/50. Compare the results in the dashboard's analytics tab.
Day 5 (Friday): Switch the executor to Minimax M2.7. Monitor for 24 hours. If the migration goes smoothly, the transition is complete. If it doesn't, fall back to Mimo (if the free period is in a grace window) or escalate to GPT 5.4 for the executor slot.
The transition week playbook's structural argument: the migration is a known, scheduled event. Don't wait for the free period to end to discover that your fallback doesn't work. Test the backup while the free period is still active. The transition is smoother if the backup is already validated, the wiring is already in place, and the parallel run has already happened.
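The Day 4 parallel run can also be scored outside the dashboard. The sketch below is one hypothetical way to do it, not part of Hermes: it splits a task list 50/50 and accepts the backup if its pass rate is within a tolerance of the current model's. The function names and the 5-point `max_drop` tolerance are assumptions.

```python
import random


def split_workload(tasks: list, seed: int = 0) -> tuple[list, list]:
    """Shuffle deterministically and split the tasks 50/50."""
    shuffled = tasks[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def pass_rate(results: list[bool]) -> float:
    """Fraction of True results; 0.0 for an empty list."""
    return sum(results) / len(results) if results else 0.0


def ready_to_switch(current: list[bool], backup: list[bool], max_drop: float = 0.05) -> bool:
    """True if the backup's pass rate is within max_drop of the current model's."""
    return pass_rate(backup) >= pass_rate(current) - max_drop


current_half, backup_half = split_workload(list(range(10)))
print(len(current_half), len(backup_half))
```

Shuffling before the split keeps easy and hard tasks from clustering on one side, which would otherwise skew the comparison between the two models.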
When "best" still wins
The "good enough" decision rule doesn't mean "best" is never the right pick. There are workloads where the flagship model is the only acceptable choice:
- Production-critical code: security-sensitive or money-handling code needs a smart model for the final review pass. The channel's framing from Course 4 §4.2: "Minimax 2.7 is 'near Opus, not Opus' — keep Sonnet or Opus reserved for the final review pass on security-sensitive or money-handling code."
- One-shot complex planning: when the orchestrator is making a decision that affects the entire workflow, the smart model is worth paying for.
- High-stakes one-shot decisions: loan approvals, medical triage, anything where a single miss is catastrophic.
- Brand-new workflows with no skill library yet: the smart model is better at cold-start planning, where there's no skill to draw on.
The "good enough" rule is for the executor and auxiliary slots. The orchestrator slot can still justify a paid flagship. The structural argument: the orchestrator's planning errors compound across the entire workflow; the executor's tool calls are recoverable; the auxiliary's narrow tasks are high-volume and reviewable. The slot determines the right model, not the workload.
Try it yourself
The hands-on goal: prove the "good enough" decision rule on your own workload. Pick a high-volume task, run it on a free model, compare to a paid model, and confirm the cost / quality trade-off is worth the savings.
- Pick a high-volume workload. Document processing, batch transforms, skill-generation sweeps, anything where the workload is large and the misses are recoverable.
- Run the workload on a free auxiliary model. Mimo V2 Pro, Gemini 2.5 Flash, Gemini 3 Flash. Note the success rate, the wall-clock time, and the cost ($0).
- Add the Mavis verifier. Workers on the free model, verifier on a different free model. Note the effective reliability.
- Run the same workload on a paid model. GPT 5.4, Minimax M2.7, Z.AI GLM 5.1. Note the success rate, the wall-clock time, and the cost.
- Compute the cost per successful run. (Cost / success rate) for each model. The cheapest cost-per-success is the right pick for this workload.
- Compare the effective reliability with the verifier. If the free + verifier combo is competitive with the paid model, you've reproduced the channel's working hypothesis: free + verifier is the right answer for high-volume auxiliary work.
- If the workload is load-bearing (production-critical, money-handling, security-sensitive), the paid model is the right pick regardless of cost. The "good enough" rule is for the executor and auxiliary slots, not the orchestrator.
- Document the result. Build a skill library of "which model is the right pick for which workload class" — the skill library is the asset, and it carries over to other models.
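The measurement steps in the list above can be wrapped in a small harness. This is a sketch under stated assumptions: `run_task` is a hypothetical callable that sends one task to the model under test and returns True when the output passes your own check, and `fake_model` is a stand-in so the example runs offline.

```python
import math
import time
from typing import Callable, Sequence


def evaluate(
    run_task: Callable[[str], bool],
    tasks: Sequence[str],
    cost_per_run: float,
) -> dict[str, float]:
    """Run every task once and summarise success rate, time, and cost."""
    start = time.perf_counter()
    outcomes = [run_task(task) for task in tasks]
    elapsed = time.perf_counter() - start

    successes = sum(outcomes)
    total_cost = cost_per_run * len(outcomes)
    return {
        "success_rate": successes / len(outcomes) if outcomes else 0.0,
        "wall_clock_s": elapsed,
        "total_cost": total_cost,
        # No successes means no finite cost per success.
        "cost_per_success": total_cost / successes if successes else math.inf,
    }


# Stand-in for a real model call: "passes" when the task has even length.
def fake_model(task: str) -> bool:
    return len(task) % 2 == 0


report = evaluate(fake_model, ["ab", "abc", "abcd", "a"], cost_per_run=8.0)
print(report)
```

Running the same harness once per model, with the same task list, gives directly comparable rows for the skill library mentioned in the last step.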
Common pitfalls
- Applying the "good enough" rule to the orchestrator slot. The orchestrator's planning errors compound. Don't use a free model for orchestrator work.
- Ignoring the verifier pattern. A 55% first-pass reliability is acceptable only if the verifier catches the bad outputs. Without the verifier, the 45% ships.
- Trusting a single benchmark score. The WildClaw numbers are a snapshot. Run your own benchmark on your own workload before committing.
- Skipping prompt caching. Cached reads are 5–10x cheaper than uncached input. A 24/7 agent on a free model without caching will burn through the free quota fast.
- Migrating to a paid model on day one of the free period ending. Test the backup model while the free period is still active. The transition is smoother if the backup is already validated.
- Locking in tooling that only works with one provider. Skills carry over, but skills that depend on provider-specific tool-call behaviour don't. Keep the integration thin.
- Reading "55%" as "55% good." A 55% first-pass reliability is 55% of the time the model gets it right on the first try. With the verifier pattern, the effective reliability is much higher.
- Using a free model for a one-shot critical decision. The "good enough" rule is for high-volume recoverable workloads. For one-shot critical decisions, the smart model is worth paying for.
- Treating the auxiliary slot as the only slot. The auxiliary slot is for narrow, high-volume, recoverable tasks. Production-critical code, money-handling workflows, anything that affects user trust — keep those on the orchestrator or executor slots.
- Optimising on benchmark scores instead of cost-per-success. The right model is the one that passes your tasks for the lowest cost, not the one with the highest benchmark score. The channel's framing: "use Opus only if you have an uncapped coding plan. Otherwise run WildClaw yourself."
Sources
This is the capstone article; the Sources section aggregates every video referenced across the course.
- Xiaomi MiMo V2 Pro Review: FREE AI Model That Rivals Claude Opus? · video_id: liSNV7kPnYg · the Mimo V2 Pro-specific review
- Top AI Models for Hermes Agent (Tier List) · 8,107 views · video_id: Af7Fg1m7hRw · cited: orchestrator / executor / auxiliary three-slot model, Gemini 2.5 Flash as default baked into Hermes, Mimo V2 Pro as high-volume king, Mimo V2 Flash good only at one-shot HTML, "hot-swap Claude out for the orchestrator and let Mimo V2 Pro handle execution"
- Best Model for Openclaw (WildClaw Benchmarks!) · 4,574 views · video_id: 31Ij4Cum5tg · cited: 51% Opus / $80, ~65% GPT 5.4 / ~$20, 55% Mimo V2 / $26, 45% Minimax 2.7 / ~$8, 40% Grok / ~$15 in 94 min, Mimo V2 (Xiaomi) free extended access
- AI Model Tier List for Agentic Workflows (April 2026) · video_id: kOZzRRQHqR8 · the full auxiliary + executor + orchestrator ranking
- Hermes vs OpenClaw: Why Everyone Is Migrating · 6,116 views · video_id: 2NbfOOD2i1E · cited: BYOK pattern, MiniMax / Z.AI / Xiaomi Mimo as named free-tier providers, prompt caching pre-configured, 15-turn self-evolution on by default, "Anthropic doesn't tell you how many credits you've burned"
- Minimax Mavis: The BEST Multi-Agent Platform for Beginners · 30,626 views · video_id: 86UIZVWkvF8 · cited: MiniMax as Mavis substrate, $10/mo entry tier, text/image/video bundled token plan, verifier pattern without shared conversation history, "the best performance comes at the early context"
- DeepSeek v4 Flash + Hermes Agent = Surprisingly STRONG · 4,893 views · video_id: s3Q9hvdlrmo · cited: V4 Flash as the highest-consumed token on Hermes, prompt caching as the explicit reason, "if you find yourself re-pasting the same context across runs, switch to V4 Flash"
- AI Models Tier List for OpenClaw Users · video_id: BF0B9CryUec · cross-listed: the OpenClaw-specific tier list
- Top AI Models to CHOOSE (Intelligence Comparison) · video_id: bNn35vlQpc4 · cross-listed: the intelligence-focused comparison
Supabase queries used to pull transcripts/summaries:
-- Course 7 master pull (re-pulled 2026-06-17).
SELECT
video_id, title, views,
summary_content,
summary_key_takeaways,
summary_verdict,
transcript_content
FROM public.videos
WHERE video_id = ANY(ARRAY[
'liSNV7kPnYg',
'Af7Fg1m7hRw',
'31Ij4Cum5tg',
'kOZzRRQHqR8',
'2NbfOOD2i1E',
'86UIZVWkvF8',
's3Q9hvdlrmo',
'BF0B9CryUec',
'bNn35vlQpc4'
]);
-- Auxiliary slot cross-check: confirm Mimo V2 Pro, Gemini 2.5 Flash, Gemini 3 Flash, Elephant Alpha, Trinity Large Preview all appear in the auxiliary section of the tier list.
SELECT
video_id, title, summary_content
FROM public.videos
WHERE video_id = 'Af7Fg1m7hRw'
AND summary_content ~* '(auxiliary|gemini.*flash|mimo|elephant|trinity)';
-- WildClaw numbers cross-check: confirm 55% Mimo V2 / $26 / 51% Opus / $80.
SELECT
video_id, title, summary_content, summary_key_takeaways
FROM public.videos
WHERE video_id = '31Ij4Cum5tg'
AND summary_content ~* '(55%|mimo|26|opus|80)';
The queries above were run against project ttxdssgydwyurwwnjogq. All 9 video_ids in the course 7 syllabus have has_transcript = true and has_summary = true; the Mimo V2 Pro "high-volume king" framing, the "BYOK + prompt caching pre-configured" claim, the auxiliary slot definition, and the WildClaw numbers are sourced directly from the summary_content and summary_key_takeaways columns.
Other doc references cited (from the source videos):
- github.com/NousResearch/hermes-agent: the Hermes Agent repo, confirmed by the v0.14.0 release link in public.ai_updates (AI Briefing 2026-05-17). The "Migrating from OpenClaw" section and the ~/.hermes/ paths in §7.2 are transcript-quoted from 2NbfOOD2i1E.
- nano config.yaml: the Hermes config file containing the default Gemini 2.5 Flash auxiliary model. Referenced in the tier list video transcript.
- Hermes dashboard (port 9119): the in-browser UI showing the cache hit rate, request counts, and per-model spend. See Course 3 §3.4 for the full breakdown.
- Mavis verifier pattern: the orchestrator + adversarial verifier architecture where workers produce, a separate agent reviews from first principles without shared conversation history. Covered in Course 2 §2.3 and §7.2 above.
public.ai_models cross-check:
Confirmed rows used in §7.1–§7.4:
- xiaomi-mimo (Mimo V2 Pro, vendor Xiaomi)
- gemini-2-5-flash (Google)
- gemini-3-flash (Google)
- minimax (MiniMax M2.7)
- minimax-m2-5 (MiniMax M2.5)
- glm-5-1 (Zhipu AI / Z.AI)
- claude-opus-4-6 (Anthropic)
- claude-opus-4-7 (Anthropic)
- grok (xAI)
- elephant-alpha (open-weight, 100B params)
- trinity-large-preview (open-weight)
- openai (GPT-5.4)
Vendor names used in the article cross-match these rows. The pricing_info column is null for every row pulled — the $0/$10–20/$50–75/$200+ pricing tiers cited in §7.4 come from the video transcripts, the Mimo V2 Pro review, and the channel's published tier list, not from the DB.
public.ai_updates cross-check:
- AI Briefing 2026-04-24 (Hermes v0.11.0 release notes): auxiliary model list expansion as part of the v0.11 release arc.
- AI Briefing 2026-05-01 (Hermes v0.12.0 "The Curator" release notes): "autonomous skill maintenance, 4 new providers, ~57% cold-start reduction". The "4 new providers" is the structural reason prompt caching ships pre-configured for new BYOK providers in the curator release.
- AI Briefing 2026-05-17 (Hermes v0.14.0 "The Foundation Release" release notes): confirms the github.com/NousResearch/hermes-agent URL and the "Migrating from OpenClaw" section in the README.
NOTE on time-stamped claims: the Mimo V2 Pro free period end date, the Mimo V2 Pro estimated $20–40/month paid pricing, the MiniMax M2.7 $10–20/month tier, the GPT 5.4 $50–75/month tier, the Z.AI $30→$72/month price move, the Anthropic Opus 4.7 "question mark" status, and the channel's "hot-swap Claude out" recommendation are all drawn from the source videos cited above. These are time-stamped claims — re-check the official Xiaomi / MiniMax / OpenAI / Z.AI / Anthropic documentation if you read this article after a new release. The structural arguments (the three-slot model, the workflow class rule, the Mavis verifier pattern, the cost / quality trade-off) are stable; the specific numbers will drift.