I cancelled Claude: Token issues, declining quality, and poor support
Token limits, pricing & usage patterns
- Many Pro ($20) users report Claude Code burning through the 5-hour session limit in minutes, sometimes on a single prompt, especially when agents spin up multiple background tasks or read large repos (a back-of-envelope token estimate follows this list).
- Some Max users (5x/10x/20x) say they rarely hit limits even with heavy daily use; others on the same tiers find metering opaque and inconsistent.
- Users complain about silent behavior changes (thinking-effort defaults, prompt-cache TTLs) that alter token usage without notice.
- Several feel Pro’s Claude Code access is “technically there but unusable,” pushing them toward the pricier Max tiers or other vendors.
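
For intuition on why a single agentic prompt can exhaust a session, here is a back-of-envelope sketch. It uses the common rule-of-thumb of roughly 4 characters per token; the re-read multiplier and session budget are illustrative assumptions, not Anthropic’s actual accounting.

```python
# Rough estimate of input tokens an agent burns reading a repo.
# Assumptions (not Anthropic's real accounting): ~4 chars/token,
# and the agent re-reads hot files as context scrolls out of window.
from pathlib import Path

CHARS_PER_TOKEN = 4          # common rule-of-thumb heuristic
RE_READS = 3                 # hypothetical: agent revisits each file ~3x
SESSION_BUDGET = 1_000_000   # hypothetical per-session token budget

def estimate_repo_tokens(root: str, exts=(".py", ".ts", ".go")) -> int:
    total_chars = sum(
        p.stat().st_size
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in exts
    )
    return (total_chars // CHARS_PER_TOKEN) * RE_READS

tokens = estimate_repo_tokens(".")
print(f"~{tokens:,} input tokens "
      f"(~{tokens / SESSION_BUDGET:.0%} of the assumed session budget)")
```

Under these assumptions, even a few megabytes of source lands in the millions of input tokens, which is consistent with the “one prompt ate my session” reports.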
Perceived quality regression & model behavior
- Many say Opus 4.5 through early 4.6 was a “peak”; later 4.6/4.7 feel more forgetful and lazier, and more prone to shortcuts, over-editing, and subtle bugs.
- Anthropic’s own postmortems (a thinking-effort default change, a cache bug, a verbosity change) are cited as partial explanations; some believe broader degradation and frequent silent tuning continue (one defensive response, pinning explicit settings, is sketched after this list).
- Others report no noticeable decline and find current Opus 4.7 on xhigh/max effort excellent, especially in Claude Code.
- Some suspect “adaptive reasoning” or routing to cheaper back‑end models; others call this paranoia or misinterpretation.
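
A recurring mitigation in these threads is to stop depending on defaults: pin a dated model snapshot and set thinking budgets explicitly, so a silent default change cannot alter behavior underneath you. A minimal sketch with the Anthropic Python SDK; the model ID and budget values are placeholders you would replace with a snapshot you have actually validated.

```python
# Pin an exact dated model snapshot and an explicit thinking budget,
# instead of an alias plus defaults the vendor may silently retune.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

resp = client.messages.create(
    model="claude-opus-4-20250514",  # placeholder: pin a validated snapshot
    max_tokens=8192,                 # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 4096},  # explicit, not default
    messages=[{"role": "user", "content": "Refactor this function..."}],
)
print(resp.usage)  # log per-call token usage so metering shifts are visible
```

Logging `usage` on every call builds your own baseline, so a silent change in effort or caching shows up in your data rather than as a vague feeling of decline.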
Workflow differences: copilot vs autopilot
- A major split:
  - Copilot users give small, well-scoped tasks, prune context, and review everything; they rarely hit limits and are happy with quality.
  - Autopilot/“vibe coding” users let agents roam large codebases for hours; they see token explosions, tangents, duplicative code, and fragile fixes, and they lose trust.
- Several argue that reviewing AI-generated code is harder than writing it, making net productivity negative for serious systems; others say agentic workflows produce months of work in days if you manage them carefully (one way to enforce that management is sketched below).
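
One concrete way people operationalize the copilot discipline is to gate every AI-produced change behind the test suite and revert automatically on failure. A minimal sketch assuming a git working tree and a pytest suite; the commands are generic, not Claude Code-specific.

```python
# Gate an AI-generated change: keep it only if the test suite still passes.
# Assumes a git working tree and a pytest suite; adapt to your stack.
import subprocess

def run(cmd: list[str]) -> bool:
    return subprocess.run(cmd).returncode == 0

def accept_or_revert(message: str) -> bool:
    if run(["pytest", "-q"]):
        subprocess.run(["git", "commit", "-am", message], check=True)
        return True
    # Tests failed: discard the change rather than debugging AI output.
    subprocess.run(["git", "checkout", "--", "."], check=True)
    return False

# Usage: after the assistant edits files in the working tree,
# accept_or_revert("ai: extract parsing helper")
```

The design choice is deliberate: keeping each change small enough to revert wholesale is what makes review cheaper than re-writing, which is exactly the point of contention between the two camps.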
Alternatives & local models
- Codex (with recent GPT‑5.4/5.5) is frequently cited as a strong or superior coding alternative; some shift most work there.
- Kimi 2.6, DeepSeek v4, GLM, Qwen3.6, Gemini, and others are mentioned as cheaper or “good enough,” often accessed via tools like OpenRouter, Opencode, Cursor, Pi.dev, Swival, and Crush.
- Many are experimenting with local models (Qwen, Gemma, etc.) via LM Studio, OMLX, vLLM, and llama.cpp; capability is improving, but hardware and setup remain non-trivial (the shared access pattern is sketched below).
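
Most of the tools above converge on the same access pattern: an OpenAI-compatible endpoint whose base URL you swap out. A sketch with the `openai` Python SDK; the base URLs are the common defaults for each server, and the model IDs are examples you would replace with whatever you actually run or route to.

```python
# The same client code talks to OpenRouter or a local server;
# only base_url, api_key, and model name change.
from openai import OpenAI

# Hosted aggregator (OpenRouter exposes an OpenAI-compatible API):
remote = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Local server (LM Studio defaults to port 1234; vLLM to 8000;
# llama.cpp's server to 8080). Local servers usually ignore the key.
local = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

for client, model in [(remote, "deepseek/deepseek-chat"),      # example IDs
                      (local, "qwen2.5-coder-14b-instruct")]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Write a binary search in Go."}],
    )
    print(resp.choices[0].message.content[:200])
```

This interchangeability is part of why the cheaper alternatives feel low-friction: switching vendors, or moving fully local, is often a one-line config change.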
Business model, lock‑in & enshittification fears
- Strong concern that all major LLM vendors are subsidizing usage now and will later jack up prices, cut limits, or degrade quality once users are dependent (“vcware,” “enshittification,” “crack dealer” analogies).
- Debate over closed vs open models: some insist open weights lag far behind SOTA; others say the gap is <10–20% and shrinking, with cost and sovereignty advantages.
Support & reliability
- Multiple reports of poor or non‑existent human support: no refunds for failed generations or token mischarges, slow/absent responses, AI‑only frontline.
- Outages, latency spikes, and opaque failures (e.g., long jobs ending in token-limit errors) erode trust, especially for people building workflows or businesses around Claude Code (a defensive retry sketch follows).
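
For people building on the API despite these complaints, the standard defensive move is exponential backoff around transient failures. A minimal sketch with the Anthropic Python SDK; note the SDK also accepts a built-in `max_retries` option, so hand-rolling like this is only worth it if you need custom behavior such as logging or job checkpointing.

```python
# Hand-rolled exponential backoff around transient API failures,
# with per-attempt logging so opaque errors at least leave a trail.
import time
import anthropic

client = anthropic.Anthropic()

def call_with_backoff(attempts: int = 5, **kwargs):
    for i in range(attempts):
        try:
            return client.messages.create(**kwargs)
        except (anthropic.RateLimitError, anthropic.APIConnectionError) as e:
            wait = 2 ** i  # 1s, 2s, 4s, 8s, 16s
            print(f"attempt {i + 1} failed ({type(e).__name__}); retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("gave up after repeated transient failures")

# Usage:
# resp = call_with_backoff(model="claude-...", max_tokens=1024,
#                          messages=[{"role": "user", "content": "..."}])
```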