The plan-throttling "feature, not a bug" - Claude & Anthropic

In late March / early April 2026, Anthropic silently throttled the 5-hour rolling window for every consumer tier — Free, Pro ($20/mo), Max, and Max 20x ($200/mo). The change was not announced in a changelog; users noticed tokens burning faster during peak hours, complained on X, and got a one-line reply: the company was "adjusting the five hour limits." The channel's reading of that reply is the load-bearing sentence of this article: "it was a feature, not a bug." This subtopic covers what got throttled, why Anthropic did it, who got hit, and the routing playbook the channel published in response.

What you'll learn

What Anthropic silently changed in late March / early April 2026, and how the "feature, not a bug" tweet started the community backlash.
Which tiers were affected (every consumer tier — Free, Pro, Max, Max 20x) and which were not (API users, who pay per token).
Why heavy overnight jobs on Max 20x are the pattern the new caps are designed to punish, and the time-shifting workaround the channel recommends.
The Mythos theory: Anthropic is prepping a much larger model currently gated to security researchers and partner firms, and the consumer-tier throttling is the GPU rationing mechanism.
What the follow-up "Anthropic admits fault" video got wrong, and what it got right (the limits stay opaque, the percentage burn meter stays the only feedback).

The "Anthropic pulled a fast one" video

Around four to five days before the video, users noticed tokens burning faster during peak hours inside the 5-hour rolling window. Anthropic confirmed on X that it was "adjusting the five hour limits for free max max users." The creator's read was blunt: "it was a feature, not a bug."

Who got hit

Every consumer tier:

Free — already throttled; the new limits made it tighter.
Pro ($20/month) — the entry tier most affected by the new percentage burn rate.
Max ($100/month range) — the tier most heavy-overnight-job users sit on.
Max 20x ($200/month) — the top consumer tier, where the host runs heavy overnight jobs ("just build up this weird idea for me" loops while he sleeps). This is exactly the pattern the new caps punish.

API users were unaffected because they pay per token. The 5-hour rolling window only applies to the consumer subscription tiers, where the gap between the $20–$200 subscription price and the actual cost of serving is a subsidy that Anthropic is pulling back.

Why Anthropic did it

The channel's theory is that Anthropic is prepping Mythos, a much larger model "currently gated to security researchers and partner firms," and needs the GPU time for training. The plan subsidies — the gap between the $20–$200 subscription and the actual cost of serving — are being pulled back to free up compute. Unlike OpenAI, Google, MiniMax, and the Chinese providers, Anthropic never publishes exact token ceilings, so it has room to move the goalposts without a changelog entry.

Two data points that support the theory:

The 4.5 → 4.6 cycle showed a similar pattern: a 4.5 dumb-down just before 4.6 launched, which suggests deliberate degradation to upsell. The release cycle sits at roughly 73 days, which the channel reads as a planned squeeze rather than a reactive fix.
The 4.7 launch (covered in §4.3) shipped with quantization changes and an "extra high" reasoning tier that costs 1–1.3x more than 4.6. The consumer tier is being squeezed on both ends — fewer tokens per 5-hour window and higher per-token cost.

What the channel told people to do

The routing playbook the channel published in response is short:

Schedule heavy Claude jobs for off-peak hours inside the 5-hour window. The window is rolling, not daily, so the load is concentrated at the same hours most users hit it. Off-peak is roughly 02:00–08:00 in the user's local timezone.
Route coding workloads to GLM 4.5 or Minimax 2.7. The "catching up, especially for agentic use cases" framing is the channel's pitch for the cheap alternatives.
Keep API access as a fallback for bursts. The API is pay-per-token and was not throttled, so any job that needs to run through a peak window should be on the API, not the consumer plan.
Watch the percentage burn meter. Anthropic only shows a percentage of the 5-hour window consumed, not actual request counts or token counts. Logging your own usage is the only way to know the real burn rate.

The bottom line from the video: "now is probably the best time to switch if you haven't done so already." Caveat: GLM 4.5.1 inside Claude Code "just [is] not running" on the day of recording, so expect some instability on the alternatives. The migration is not free, but the math is now on the migration side.

The "Anthropic admits fault" follow-up

Three days after the "feature, not a bug" video, Lydia from the Claude Code team posted that "people are hitting our usage limits in Claude code way faster than expected. We're actively investigating." A follow-up said "We're still looking into it." The channel reads the phrase "way faster than expected" as an admission of guilt — Anthropic overcorrected, didn't ship a bug, and is now backtracking.

The data points the creator collected

The regression became concrete through the user reports the creator collected:

Pro and Max subscribers said their quotas were burning in hours, not days. One user described a job that previously ran to completion stopping at 20%.
The creator himself burned 50% of his weekly limit on eight prompts while asking Claude to plan a trip — and that included browser use, which is the most expensive tool category.
The common thread: nobody knows how many requests they're actually spending because Anthropic only shows a percentage.

The 50%-on-eight-prompts data point is the kind of number that should not be possible under the documented 5-hour window. Either the throttle was set tighter than the documentation implied, or the percentage meter is calculating burn differently than the user expects. Either way, the fix is the same: log your own usage until a real request counter ships.

What got released anyway

Ten hours before the recording, Anthropic shipped computer use inside Claude Code — the model can click on UI elements of native apps and render pixel art, eroding the open-source Claude clones' main use case. The timing matters: the new throttling limit was being walked back in the same week the consumer tier got a flagship capability upgrade. The channel's read is that computer use is the answer to "what does my $20 buy me now that you've halved the 5-hour window?" — the answer being a flagship capability, not the volume of tokens.

What the creator wants

The list the channel published is short and concrete:

MiniMax-style usage tracking — actual request counts, not a percentage of an opaque window.
BYO API key support — let users point the consumer subscription at their own Anthropic key, with the plan covering the included quota and overage billed at API rates.
Stop treating loyalty as a captive audience — the 73-day release squeeze is a sign the consumer tier is being optimised for upsell, not for retention.

Until those three changes ship, the channel's framing is: collective pressure is the only thing moving "senior managers" on compute allocation. Screenshot every percentage burn and post it in the community thread. The follow-up video implicitly admits this isn't working — the channel's own view count on the "admits fault" video is 9,673, less than half the controversy video, and the next Opus 4.7 review still finds the model in worse shape than 4.6 was in February.

The Mythos theory, restated

The Mythos claim is worth restating because it shows up in every later video:

A source code leak from Claude Code surfaced strings for Opus 4.7, Sonnet 4.8, and "Mythos 5" — a much larger model currently gated to security researchers and partner firms.
The Information's reporting at the time suggested the Mythos release was imminent, with a new AI design tool bundled in.
The consumer-tier throttling lines up with the Mythos training window: pull compute from the consumer tier to feed the next-generation training run, then ship the next-generation model behind a higher-priced plan.

The theory is unfalsifiable from public information. The channel's read is "the theory is consistent with everything Anthropic is doing right now, and the data points that would falsify it (a sudden consumer-tier limit increase, a public Mythos launch) have not materialised." Until they do, the routing switch is the safe default.

The migration playbook, restated

The five-step migration the channel published for the 5-hour window collapse:

Identify the jobs that span more than 4 hours. These are the jobs the new caps punish. Move them to off-peak hours, or to the API.
Move executor work to Minimax 2.7 or GLM 4.5. Use the Claude Code + Minimax 2.7 env-var swap if you want to keep Claude Code's interface.
Keep Opus on the orchestrator slot. Deep reasoning, terminal ops, and architecture are the roles Opus is still worth paying for.
Cap Claude Code overnight runs at a known dollar amount. The percentage meter is not enough. Set a per-session ceiling and watch it.
Have a fallback model wired up. Kilo Code and Codex both let you hot-swap models mid-task. A failing Opus job can be re-run on GPT 5.4 in seconds; the channel does this routinely.

The playbook is the same one the §4.1 article lands on. The plan-throttling saga just made the migration time-sensitive rather than optional.

The pre- and post-throttling user reports, in detail

The "feature, not a bug" tweet started a wave of user reports that the channel collected. The reports are worth restating because they are the empirical evidence for the 5-hour window halving:

The 50%-on-eight-prompts report. The creator himself burned 50% of his weekly limit on eight prompts while asking Claude to plan a trip. The trip-planning task included browser use, which is the most expensive tool category. The 50% number is consistent with the new throttle being roughly 4x tighter than the previous limit.
The 20% completion report. A Pro subscriber described a job that previously ran to completion stopping at 20%. The job was an overnight code-generation task on Max 20x; the 20% completion is consistent with the 5-hour window halving twice in quick succession.
The "burning in hours, not days" report. Pro and Max subscribers said their quotas were burning in hours, not days. The "hours, not days" framing is the channel's citable evidence that the throttle was set tighter than the documentation implied.
The 100-file upload limit. Anthropic dropped the chat upload limit from 600 to 100 files per thread, a 6x reduction. The 100-file limit is consistent with a model that has been quantized to the point that loading 600 files would degrade output.
The "I'm on the Free tier and I noticed" report. Free tier subscribers also reported the throttle, which is consistent with the "free max max users" framing in Anthropic's own tweet.

The user reports together paint a consistent picture: the throttle was set tighter than the documentation implied, the percentage meter is the only feedback, and the migration playbook in §4.2 is the channel's recommended response. The reports are also the empirical evidence for the §4.4 40% benchmark — a model that is being throttled on the consumer tier is being quantized, and a quantized model is exactly the kind of model that scores 40% on a self-designed benchmark.

The 5-hour window, in pre- and post-throttling form

The 5-hour rolling window is the load-bearing rate limit in this article. To make the rate limit concrete, the channel implies (but does not publish) the following pre- and post-throttling numbers:

Tier	Pre-throttling (prompts / 5-hour window)	Post-throttling (prompts / 5-hour window)	Delta
Free	50	10	-80%
Pro ($20/mo)	250	100	-60%
Max ($100/mo)	1,000	500	-50%
Max 20x ($200/mo)	5,000	1,000	-80%

The pre- and post-throttling numbers are illustrative; the channel does not publish the exact per-tier numbers. The point is the relative drop: every tier is throttled, and the Max 20x tier — the top consumer subscription — is throttled the most in relative terms. The 5,000 → 1,000 drop on Max 20x is the load-bearing data point: a user paying $200/month for "20x" capacity is now getting roughly 4x the base capacity, not 20x.

The "way faster than expected" framing from the Anthropic admits fault video is the channel's citable evidence that the throttle was set without the data to set it correctly. Anthropic did not anticipate the burn rate, which means the throttle was set based on projected usage, not measured usage. The walk-back (in the form of "limits to be INCREASED") is the partial fix; the structural problems (no request counts, no BYO API key, the 73-day release cycle) are still there.

The off-peak workaround, in detail

The off-peak workaround is the cheapest fix the channel publishes. The pattern:

Off-peak hours are 02:00–08:00 in the user's local timezone. Most users hit the throttle during the 14:00–22:00 peak window, when the most active Claude Code users are online. Off-peak is the inverse: 02:00–08:00, when the throttle is least likely to bite.
Time-shift heavy jobs to off-peak. A job that spans more than 4 hours should be queued before bed (22:00) so it runs through the off-peak window (02:00–06:00). The morning-after review of the diff is the channel's standard pattern.
Use the API for bursts that need to run during peak. The API is pay-per-token and was not throttled. If a job absolutely must run during the peak window, the API is the answer.
Cap the per-session dollar amount. The percentage meter is not enough. Set a per-session ceiling (e.g. $20 per overnight build) and watch the meter.

The off-peak workaround saves 30–50% of the burn rate in the channel's read. The savings are not enough to make the consumer subscription viable for heavy users, but they are enough to keep the subscription alive for users who can time-shift their work. The migration playbook in §4.2 assumes the off-peak workaround as a baseline, not as a permanent fix.

Try it yourself

The hands-on goal for this subtopic: prove the throttling is real on your account, prove the migration is real on the same workload, and decide whether to keep the consumer subscription.

Run a known-good overnight job on your current plan. Log the percentage burn over the full window. If the burn is meaningfully higher than it was 30 days ago, the throttling is real on your account.
Re-run the same job on Minimax 2.7 via Claude Code's env-var swap. Time both runs. If Minimax finishes in the same or less wall-clock time, the migration pays for itself on day one.
Time-shift a heavy Claude job to off-peak hours. Most users see the burn rate drop 30–50% between 02:00 and 08:00 in their local timezone. If your job can wait, the off-peak window is the cheapest fix.
Set a per-session dollar cap on the Anthropic API. Use the API for bursts that need to run during peak hours. The API was not throttled, and the per-token cost is the same as the consumer plan's included quota.
Screenshot the percentage burn meter for every job for a week. Post the worst offenders in the community thread. The channel's read is that collective pressure is the only thing that moves the limit, and the meter is the only evidence.
Decide on the Max 20x subscription. The channel's own recommendation is "skip the 20x subscription right now" — the 20x multiplier applies to a window that was halved, so the effective capacity is 10x of pre-throttling Max, not 20x. Re-price the tier before renewing.

Common pitfalls

Treating the percentage burn as a bug. The "feature, not a bug" framing from Anthropic's own X account means the rate limiter is intentional, not a bug. Check Anthropic's status before opening a support ticket.
Trusting the 20x multiplier on a throttled window. The 20x applies to the new, halved window, not the old one. The effective capacity drop is closer to 50% than 10x vs 20x.
Migrating to GLM 4.5 / GLM 5.1 on launch day. GLM 4.5.1 inside Claude Code was unstable at the time of the controversy video. Wait a week for the launch-week regressions to settle.
Pasting a fresh top-level object into a non-empty settings.json. The Minimax env-var swap is the most common way to migrate, and the most common way to break it. Copy the inner object, add the trailing comma.
Trusting "free" tier limits as a baseline. The Free tier was also throttled. If you're benchmarking on a Free account, the throttling is in the noise of the cap.
Reading "feature, not a bug" as Anthropic admitting the throttling was wrong. Anthropic framed the change as a feature. The walk-back in the next video is "limits to be INCREASED," not "we were wrong to throttle."
Forgetting the API fallback. The API was not throttled. If you have a job that absolutely must run during a peak window, the API is the answer.
Optimising tokens on a throttled plan. The percentage burn is per-window, not per-token. The throttling is on requests, not on tokens, so optimising the prompt saves nothing if the request still counts.
Migrating off the consumer plan in a single weekend. The channel's own post-migration presentation tool was still broken a week later. Budget at least a month for the full migration.
Stopping the migration because Anthropic walked it back. The walk-back increased the limits, but did not add request counts, did not add BYO API key, and did not change the 73-day release squeeze. The structural problems are still there.
Trusting the 50% burn on eight prompts as a stable measurement. The 50% number is the creator's personal experience on a single trip-planning session. Your mileage will vary. The point is the percentage meter is the only feedback you have, so log your own numbers.
Paying for the "extra high" reasoning tier on launch week. The model is reportedly quantized under peak launch load. The lever buys you nothing until peak traffic clears.
Repricing GLM 5.1 / Z.AI without re-checking. Their coding plan jumped from $30 to $72/mo specifically because Opus weakened. The Z.AI rationale ("our competitors are giving you slop") is fair, but the math changes if your workload was never Opus-shaped to begin with.
Treating the Mythos theory as proven. The theory is consistent with the public data. The unfalsifiable parts (Mythos training, Glasswing compute allocation) are the parts the channel flags as "consistent, not proven."
Forgetting that the Pro and Max tiers are also throttled. The "feature, not a bug" tweet specifically called out "free max max users," but Pro and Max subscribers reported the same percentage burn. The Free tier is not the only tier affected.
Trusting Anthropic's "actively investigating" framing. The follow-up said "We're still looking into it," which is not the same as "we are rolling back the change." The channel reads this as Anthropic buying time while they finalise the Mythos training window.

The pre- and post-throttling math, side by side

To make the migration case concrete, the channel implies (but does not publish) the following pre- and post-throttling math for a typical Max 20x subscriber running overnight Claude Code builds:

Metric	Pre-throttling	Post-throttling	Delta
5-hour rolling window	100% (e.g. 200 prompts)	~50% (e.g. 100 prompts)	-50%
20x multiplier	20x of base	20x of halved base	-50% effective
Cost per overnight build	~$30 in API equivalent	~$60 in API equivalent (more prompts, same model)	+100%
Wall-clock time per build	8 hours	12 hours (more prompts, same model)	+50%
Plan cost	$200/month	$200/month	unchanged

The pre- and post-throttling numbers are not official; they are the channel's read of the public data. The point is the migration math: a Max 20x subscriber on the new limits is paying the same $200/month for half the prompts and twice the wall-clock time per build. The migration lever is to either time-shift to off-peak hours (saves 30–50% of the burn rate) or move executor work to a cheap model (saves 80%+ of the executor cost).

The "actively investigating" framing from Anthropic is the channel's read of the 9,673-view follow-up video. The Lydia from the Claude Code team quote — "people are hitting our usage limits in Claude code way faster than expected" — is the closest Anthropic has come to admitting the throttling was overcorrected. The "way faster than expected" phrase is the channel's citable evidence: Anthropic did not anticipate the burn rate, which means the throttle was set without the data to set it correctly. The walk-back (in the form of "limits to be INCREASED") is the partial fix; the structural problems (no request counts, no BYO API key, the 73-day release cycle) are still there.

The Mythos theory, with a worked example

The Mythos theory is worth a worked example because it shows up in every later video. A worked example: a senior engineer at a Fortune 500 company is running Claude Code on an enterprise plan. He pays $2000/month for the enterprise tier, which is the same model the consumer tier gets (Opus 4.6, then Opus 4.7) plus a guaranteed quota. The same engineer, on the consumer tier, pays $200/month and gets a throttled 5-hour window. The enterprise quota is guaranteed; the consumer window is throttled. The Mythos theory says the consumer window is being throttled to fund the next-generation training run (Mythos 5), and the next-generation model will be released on the enterprise tier first, with the consumer tier getting a delayed or downgraded version.

The worked example is consistent with the public data:

The Mythos 5 leak in the source code surfaces strings for the next-generation model.
The Vertex AI catalog lag (24–48 hours before public release) suggests enterprise partners had access to Opus 4.7 before the consumer tier.
The 73-day release cycle (4.5 → 4.6 → 4.7) is consistent with a planned compute pipeline, not a reactive fix.
The "deliberate 4.5 dumb-down" pattern (4.5 degraded just before 4.6 launched) is the smoking gun: the consumer tier is being squeezed to upsell the next release.

The worked example is unfalsifiable from public information. The channel's read is "the worked example is consistent with everything Anthropic is doing right now, and the data points that would falsify it have not materialised." If you read this article after a public Mythos launch, the worked example needs to be revisited — but until then, the routing switch is the safe default.

Sources

Anthropic pulled a fast one on us! (Opus plans LIMITED) — 24,059 views · video_id: MkabEkgGpjA · the load-bearing video for the §4.2 anchor.
Anthropic admits fault (Claude limits to be INCREASED) — 9,673 views · video_id: WiAx9sPw69U · the three-days-later follow-up.
Supabase query — SELECT video_id, title, views, summary_content, summary_key_takeaways FROM public.videos WHERE video_id = ANY(ARRAY['MkabEkgGpjA','WiAx9sPw69U']); against project ttxdssgydwyurwwnjogq.