Claude Code users hitting usage limits 'way faster than expected'

Usage limits & suspected bugs

  • Many users report hitting Claude Code limits dramatically faster than before, sometimes after a single small query or a few prompts, even on paid plans (Pro/Max).
  • Some see large percentage jumps (e.g., 0% → 12% for a trivial prompt) and inconsistent day‑to‑day consumption.
  • A reverse‑engineering effort claims there are cache bugs:
    • Certain “magic strings” (e.g., about billing/tokens) in a conversation may invalidate KV cache and force full context reloads.
    • Using --resume in large conversations may rebuild the entire conversation cache, making resumption far more expensive than expected.
  • Others report unusually slow responses, looping behavior, and retries that never succeed until the session is manually restarted.
  • Anthropic has acknowledged investigations into limits “hitting faster,” but users say refunds/adjustments are unclear or absent.
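The cache claim above hinges on how prefix-based prompt caching works in general: cached context is only reusable when the new request starts with exactly the previously cached token prefix, so any change to an earlier part of the conversation forces a full re-read. This is a toy sketch of that mechanism under stated assumptions; it is not Claude Code's actual implementation, and the "magic string" behavior is the reverse-engineering claim, not confirmed internals.

```python
# Toy prefix cache (illustrative only, NOT Claude Code internals):
# only the longest exactly-matching prefix can be served from cache.

def cached_tokens(cached_prefix: list[str], new_prompt: list[str]) -> int:
    """Return how many leading tokens can be served from cache."""
    n = 0
    for a, b in zip(cached_prefix, new_prompt):
        if a != b:
            break  # first mismatch ends the reusable prefix
        n += 1
    return n

history = ["sys", "user:fix bug", "asst:done", "user:thanks"]

# Appending a new turn keeps the whole existing prefix warm:
assert cached_tokens(history, history + ["user:next task"]) == 4

# But anything that rewrites an *earlier* token (e.g., an injected
# notice near the system prompt) invalidates everything after the
# change point, forcing the full context to be reprocessed:
mutated = ["sys+billing-notice"] + history[1:]
assert cached_tokens(history, mutated) == 0
```

This is why a mid-conversation mutation near the start of the context (or a full rebuild on --resume) would be far more expensive than simply appending a new turn.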

Pricing, transparency & “token anxiety”

  • People complain they can't see or predict actual token usage or hard limits; Anthropic describes plan tiers only relative to one another, not in absolute terms.
  • This unpredictability creates “token anxiety” and makes planning deep work sessions difficult.
  • Some suspect not just bugs but quiet tightening of quotas or “boiling the frog”–style pricing experiments; others think demand/compute costs or fixes to earlier under‑counting are more plausible.
  • The 5‑hour usage window, differing peak/non‑peak rates, and “extra usage” upsells that auto‑enable for some users add to mistrust.

User experience & value

  • Many find Claude Code highly capable for complex, agentic coding and codebase reasoning, often outperforming other tools.
  • Others see it as a “token hog,” preferring manual context curation over fully agentic flows.
  • For heavier coding, multiple users say Pro is now unusable; they hit limits quickly even on modest projects.

Alternatives, local models & routing

  • Numerous comments discuss routing easy tasks (summaries, translations) to cheaper/open‑weight models (Qwen, GLM, Kimi, etc.) via providers/routers and reserving Claude/Opus for hard problems.
  • Some argue open‑weight and local models are improving fast and can already replace proprietary tools for many workloads; others say they still lag for “real engineering” on large codebases.
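The routing idea described above can be sketched as a simple dispatch rule: classify the task, send cheap well-understood work to an open-weight model, and reserve the frontier model for hard agentic problems. Everything here is illustrative and assumed (the model names, the task categories, the token threshold); real routers typically sit behind an OpenAI-compatible proxy and use richer heuristics.

```python
# Hypothetical task router (all names and thresholds are assumptions,
# not a real product's API): cheap tasks go to an open-weight model,
# hard ones to the expensive frontier model.

CHEAP_MODEL = "qwen2.5-coder"    # assumed open-weight/local endpoint
EXPENSIVE_MODEL = "claude-opus"  # reserved for hard problems

# Task kinds the commenters consider "easy": summaries, translations, etc.
EASY_TASKS = {"summarize", "translate", "rename", "format"}

def route(task_kind: str, context_tokens: int) -> str:
    """Pick a model by task kind and context size (toy heuristic)."""
    if task_kind in EASY_TASKS and context_tokens < 8_000:
        return CHEAP_MODEL
    return EXPENSIVE_MODEL

assert route("summarize", 2_000) == CHEAP_MODEL      # cheap path
assert route("refactor", 120_000) == EXPENSIVE_MODEL  # hard path
```

Even a crude rule like this captures the trade-off in the thread: most quota burn comes from letting the expensive model handle work a cheaper model could do.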

Trust, support & long‑term concerns

  • Users criticize poor customer support (AI front‑ends, difficulty reaching humans) and lack of quota remediation.
  • Broader worries: dependence on a single vendor, future price hikes, dynamic/personalized pricing, and the push toward on‑prem or local‑first strategies to regain control.