Claude Code users hitting usage limits 'way faster than expected'
Usage limits & suspected bugs
- Many users report hitting Claude Code limits dramatically faster than before, sometimes after a single small query or a few prompts, even on paid plans (Pro/Max).
- Some see large percentage jumps (e.g., 0% → 12% for a trivial prompt) and inconsistent day‑to‑day consumption.
- A reverse‑engineering effort claims there are cache bugs:
  - Certain “magic strings” in a conversation (e.g., mentions of billing or tokens) may invalidate the KV cache and force full context reloads.
  - Using --resume in large conversations may rebuild the entire conversation cache, making resumption far more expensive than expected.
- Others report unusually slow or looping behavior, including retries that never succeed until the session is manually restarted.
- Anthropic has acknowledged investigations into limits “hitting faster,” but users say refunds/adjustments are unclear or absent.
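The claimed cache bugs matter because prompt caching is what keeps long conversations affordable. A minimal sketch of the cost math, assuming cached prefix tokens are billed at roughly a 10× discount (a typical prompt-caching rate; the actual Claude Code internals and rates here are assumptions, not published figures):

```python
def billed_input_tokens(context_tokens: int,
                        cached_prefix_tokens: int,
                        cache_discount: float = 0.1) -> float:
    """Effective billed input tokens for one turn: cached prefix tokens
    at a discounted rate, the remainder at full price."""
    uncached = context_tokens - cached_prefix_tokens
    return uncached + cached_prefix_tokens * cache_discount

# A 100k-token conversation, with and without a warm prefix cache:
warm = billed_input_tokens(100_000, 95_000)  # prefix cache hit
cold = billed_input_tokens(100_000, 0)       # cache invalidated mid-session
```

Under these assumed numbers, a single cache invalidation makes the turn roughly 7× more expensive, which would be consistent with the large, sudden percentage jumps users describe.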
Pricing, transparency & “token anxiety”
- People complain they can’t see or predict real token usage or hard limits; Anthropic only describes tiers relative to each other.
- This unpredictability creates “token anxiety” and makes planning deep work sessions difficult.
- Some suspect not just bugs but quiet tightening of quotas or “boiling the frog”–style pricing experiments; others think demand/compute costs or fixes to earlier under‑counting are more plausible.
- The 5‑hour usage window, differing peak/non‑peak rates, and “extra usage” upsells that auto‑enable for some users add to mistrust.
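Since no real-time meter is exposed, some users fall back on rough local estimates. A sketch using the common ~4-characters-per-token rule of thumb for English text (a heuristic, not an official Anthropic figure):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate for English text; real tokenizers vary,
    especially for code, so treat this as an order-of-magnitude guide."""
    return max(1, round(len(text) / chars_per_token))

# e.g., a 400-character prompt is on the order of 100 tokens:
approx = estimate_tokens("a" * 400)
```

Such estimates only cover the visible prompt; they cannot account for system prompts, tool output, or cache behavior, which is precisely the opacity users are complaining about.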
User experience & value
- Many find Claude Code highly capable for complex, agentic coding and codebase reasoning, often outperforming other tools.
- Others see it as a “token hog,” preferring manual context curation over fully agentic flows.
- For heavier coding, multiple users say Pro is now unusable; they hit limits quickly even on modest projects.
Alternatives, local models & routing
- Numerous comments discuss routing easy tasks (summaries, translations) to cheaper/open‑weight models (Qwen, GLM, Kimi, etc.) via providers/routers and reserving Claude/Opus for hard problems.
- Some argue open‑weight and local models are improving fast and can already replace proprietary tools for many workloads; others say they still lag for “real engineering” on large codebases.
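The routing strategy described above can be sketched as a simple dispatch table. Model names and task categories here are illustrative assumptions, not a real router API:

```python
# Hypothetical model router: cheap open-weight model for easy task types,
# expensive frontier model reserved for hard problems.
CHEAP_MODEL = "qwen-coder"      # stand-in for Qwen/GLM/Kimi-class models
FRONTIER_MODEL = "claude-opus"  # stand-in for the expensive tier

EASY_TASKS = {"summarize", "translate", "format", "rename"}

def route(task_type: str) -> str:
    """Pick a model by task difficulty; default to the frontier model."""
    return CHEAP_MODEL if task_type in EASY_TASKS else FRONTIER_MODEL
```

In practice, commenters do this via provider routers rather than hand-rolled code, but the design choice is the same: spend frontier-model quota only where cheaper models demonstrably fail.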
Trust, support & long‑term concerns
- Users criticize poor customer support (AI front‑ends, difficulty reaching humans) and lack of quota remediation.
- Broader worries: dependence on a single vendor, future price hikes, dynamic/personalized pricing, and the push toward on‑prem or local‑first strategies to regain control.