I cancelled Claude: Token issues, declining quality, and poor support

Token limits, pricing & usage patterns

  • Many Pro ($20) users report Claude Code burning through 5‑hour session limits in minutes, sometimes on a single prompt, especially when agents spin up multiple background tasks or read large repos.
  • Some Max users (5x/10x/20x) say they rarely hit limits even with heavy daily use; others on the same tiers find metering opaque and inconsistent.
  • Several complain about silent behavior changes (thinking‑effort defaults, cache TTLs) that alter token consumption with no changelog or announcement.
  • Several feel Pro’s Claude Code access is “technically there but unusable,” pushing them toward pricier Max or other vendors.

Perceived quality regression & model behavior

  • Many say Opus 4.5–early 4.6 was a “peak”; later 4.6/4.7 feel more forgetful, lazier, more prone to shortcuts, over‑editing, and subtle bugs.
  • Anthropic’s own postmortems (thinking effort default change, cache bug, verbosity change) are cited as partial explanations; some think broader degradation and frequent silent tuning continue.
  • Others report no noticeable decline and find current Opus 4.7 on xhigh/max effort excellent, especially in Claude Code.
  • Some suspect “adaptive reasoning” or routing to cheaper back‑end models; others call this paranoia or misinterpretation.

Workflow differences: copilot vs autopilot

  • A major split:
    • Copilot users give small, well‑scoped tasks, prune context, and review everything; they rarely hit limits and are happy with quality.
    • Autopilot/“vibe coding” users let agents roam large codebases for hours; they see token explosions, tangents, duplicative code, fragile fixes, and lose trust.
  • Several argue that reviewing AI‑generated code is harder than writing it, making net productivity negative for serious systems; others say agentic workflows produce months of work in days if you manage them carefully.

Alternatives & local models

  • Codex (with recent GPT‑5.4/5.5) is frequently cited as a strong or superior coding alternative; some shift most work there.
  • Kimi 2.6, DeepSeek v4, GLM, Qwen3.6, Gemini, and others are mentioned as cheaper or “good enough,” often accessed via tools like OpenRouter, Opencode, Cursor, Pi.dev, Swival, and Crush.
  • Many are experimenting with local models (Qwen, Gemma, etc.) via LM Studio, OMLX, vLLM, llama.cpp; capability is improving but hardware and setup remain non‑trivial.
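
  One reason local setups are attractive: LM Studio, vLLM, and llama.cpp's llama‑server all expose an OpenAI‑compatible /v1/chat/completions endpoint, so switching backends is mostly a matter of changing a base URL. A minimal sketch, assuming a server on localhost:8000 and a hypothetical model id (substitute whatever your server actually reports):

```python
# Sketch: querying a locally hosted model through the OpenAI-compatible
# /v1/chat/completions endpoint that LM Studio, vLLM, and llama.cpp's
# llama-server all expose. Base URL, port, and model name are assumptions.
import json
import urllib.request

LOCAL_BASE_URL = "http://localhost:8000/v1"   # assumption: default vLLM port
MODEL_NAME = "qwen2.5-coder-7b-instruct"      # hypothetical local model id


def build_chat_request(prompt: str, model: str = MODEL_NAME,
                       temperature: float = 0.2) -> dict:
    """Build an OpenAI-compatible chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def complete(prompt: str, base_url: str = LOCAL_BASE_URL) -> str:
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

  With a server running, `complete("Explain this diff")` works unchanged against any of the three backends; pointing `base_url` at a hosted aggregator like OpenRouter (plus an Authorization header) is the same pattern, which is why these tools make vendor-switching cheap.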

Business model, lock‑in & enshittification fears

  • Strong concern that all major LLM vendors are subsidizing now and will later jack prices, cut limits, or degrade quality once users are dependent (“vcware,” “enshittification,” “crack dealer” analogies).
  • Debate over closed vs open models: some insist open weights lag far behind SOTA; others say the gap is <10–20% and shrinking, with cost and sovereignty advantages.

Support & reliability

  • Multiple reports of poor or nonexistent human support: no refunds for failed generations or token mischarges, slow or absent responses, and an AI‑only support frontline.
  • Outages, latency spikes, and opaque failures (e.g., long jobs ending in token‑limit errors) erode trust, especially for people building workflows or businesses around Claude Code.