I cancelled Claude: Token issues, declining quality, and poor support
Token limits, pricing & usage patterns
- Many Pro ($20) users report Claude Code burning through the 5-hour session limit in minutes, sometimes on a single prompt, especially when agents spin up multiple background tasks or read large repos (a back-of-envelope token estimate follows this list).
- Some Max users (5x/10x/20x) say they rarely hit limits even with heavy daily use; others on the same tiers find metering opaque and inconsistent.
- Users complain about silent behavior changes (thinking-effort defaults, prompt-cache TTLs) that alter token usage without notice.
- Several feel Pro’s Claude Code access is “technically there but unusable,” pushing them toward the pricier Max tiers or other vendors.
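
For intuition on why a single agentic prompt can exhaust a session, here is a back-of-envelope sketch. It uses the common rule-of-thumb of roughly 4 characters per token; the re-read multiplier and session budget are illustrative assumptions, not Anthropic’s actual accounting.

```python
# Rough estimate of input tokens an agent burns reading a repo.
# Assumptions (not Anthropic's real accounting): ~4 chars/token,
# and the agent re-reads hot files as context scrolls out of window.
from pathlib import Path

CHARS_PER_TOKEN = 4          # common rule-of-thumb heuristic
RE_READS = 3                 # hypothetical: agent revisits each file ~3x
SESSION_BUDGET = 1_000_000   # hypothetical per-session token budget

def estimate_repo_tokens(root: str, exts=(".py", ".ts", ".go")) -> int:
    total_chars = sum(
        p.stat().st_size
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in exts
    )
    return (total_chars // CHARS_PER_TOKEN) * RE_READS

tokens = estimate_repo_tokens(".")
print(f"~{tokens:,} input tokens "
      f"(~{tokens / SESSION_BUDGET:.0%} of the assumed session budget)")
```

Under these assumptions, even a few megabytes of source lands in the millions of input tokens, which is consistent with the “one prompt ate my session” reports.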
Perceived quality regression & model behavior
- Many say Opus 4.5 through early 4.6 was a “peak”; later 4.6/4.7 feel more forgetful and lazier, and more prone to shortcuts, over-editing, and subtle bugs.
- Anthropic’s own postmortems (a thinking-effort default change, a cache bug, a verbosity change) are cited as partial explanations; some believe broader degradation and frequent silent tuning continue (one defensive response, pinning explicit settings, is sketched after this list).
- Others report no noticeable decline and find current Opus 4.7 on xhigh/max effort excellent, especially in Claude Code.
- Some suspect “adaptive reasoning” or routing to cheaper back‑end models; others call this paranoia or misinterpretation.
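
A recurring mitigation in these threads is to stop depending on defaults: pin a dated model snapshot and set thinking budgets explicitly, so a silent default change cannot alter behavior underneath you. A minimal sketch with the Anthropic Python SDK; the model ID and budget values are placeholders you would replace with a snapshot you have actually validated.

```python
# Pin an exact dated model snapshot and an explicit thinking budget,
# instead of an alias plus defaults the vendor may silently retune.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

resp = client.messages.create(
    model="claude-opus-4-20250514",  # placeholder: pin a validated snapshot
    max_tokens=8192,                 # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 4096},  # explicit, not default
    messages=[{"role": "user", "content": "Refactor this function..."}],
)
print(resp.usage)  # log per-call token usage so metering shifts are visible
```

Logging `usage` on every call builds your own baseline, so a silent change in effort or caching shows up in your data rather than as a vague feeling of decline.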
Workflow differences: copilot vs autopilot
- A major split:
  - Copilot users give small, well-scoped tasks, prune context, and review everything; they rarely hit limits and are happy with quality.
  - Autopilot/“vibe coding” users let agents roam large codebases for hours; they see token explosions, tangents, duplicative code, and fragile fixes, and they lose trust.
- Several argue that reviewing AI-generated code is harder than writing it, making net productivity negative for serious systems; others say agentic workflows produce months of work in days if you manage them carefully (one way to enforce that management is sketched below).
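
One concrete way people operationalize the copilot discipline is to gate every AI-produced change behind the test suite and revert automatically on failure. A minimal sketch assuming a git working tree and a pytest suite; the commands are generic, not Claude Code-specific.

```python
# Gate an AI-generated change: keep it only if the test suite still passes.
# Assumes a git working tree and a pytest suite; adapt to your stack.
import subprocess

def run(cmd: list[str]) -> bool:
    return subprocess.run(cmd).returncode == 0

def accept_or_revert(message: str) -> bool:
    if run(["pytest", "-q"]):
        subprocess.run(["git", "commit", "-am", message], check=True)
        return True
    # Tests failed: discard the change rather than debugging AI output.
    subprocess.run(["git", "checkout", "--", "."], check=True)
    return False

# Usage: after the assistant edits files in the working tree,
# accept_or_revert("ai: extract parsing helper")
```

The design choice is deliberate: keeping each change small enough to revert wholesale is what makes review cheaper than re-writing, which is exactly the point of contention between the two camps.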
Alternatives & local models
- Codex (with recent GPT‑5.4/5.5) is frequently cited as a strong or superior coding alternative; some shift most work there.
- Kimi 2.6, DeepSeek v4, GLM, Qwen3.6, Gemini, and others are mentioned as cheaper or “good enough,” often accessed via tools like OpenRouter, Opencode, Cursor, Pi.dev, Swival, and Crush.
- Many are experimenting with local models (Qwen, Gemma, etc.) via LM Studio, OMLX, vLLM, and llama.cpp; capability is improving, but hardware and setup remain non-trivial (the shared access pattern is sketched below).
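
Most of the tools above converge on the same access pattern: an OpenAI-compatible endpoint whose base URL you swap out. A sketch with the `openai` Python SDK; the base URLs are the common defaults for each server, and the model IDs are examples you would replace with whatever you actually run or route to.

```python
# The same client code talks to OpenRouter or a local server;
# only base_url, api_key, and model name change.
from openai import OpenAI

# Hosted aggregator (OpenRouter exposes an OpenAI-compatible API):
remote = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Local server (LM Studio defaults to port 1234; vLLM to 8000;
# llama.cpp's server to 8080). Local servers usually ignore the key.
local = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

for client, model in [(remote, "deepseek/deepseek-chat"),      # example IDs
                      (local, "qwen2.5-coder-14b-instruct")]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Write a binary search in Go."}],
    )
    print(resp.choices[0].message.content[:200])
```

This interchangeability is part of why the cheaper alternatives feel low-friction: switching vendors, or moving fully local, is often a one-line config change.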
Business model, lock‑in & enshittification fears
- Strong concern that all major LLM vendors are subsidizing usage now and will later jack up prices, cut limits, or degrade quality once users are dependent (“vcware,” “enshittification,” “crack dealer” analogies).
- Debate over closed vs open models: some insist open weights lag far behind SOTA; others say the gap is <10–20% and shrinking, with cost and sovereignty advantages.
Support & reliability
- Multiple reports of poor or non‑existent human support: no refunds for failed generations or token mischarges, slow/absent responses, AI‑only frontline.
- Outages, latency spikes, and opaque failures (e.g., long jobs ending in token-limit errors) erode trust, especially for people building workflows or businesses around Claude Code (a defensive retry sketch follows).
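
For people building on the API despite these complaints, the standard defensive move is exponential backoff around transient failures. A minimal sketch with the Anthropic Python SDK; note the SDK also accepts a built-in `max_retries` option, so hand-rolling like this is only worth it if you need custom behavior such as logging or job checkpointing.

```python
# Hand-rolled exponential backoff around transient API failures,
# with per-attempt logging so opaque errors at least leave a trail.
import time
import anthropic

client = anthropic.Anthropic()

def call_with_backoff(attempts: int = 5, **kwargs):
    for i in range(attempts):
        try:
            return client.messages.create(**kwargs)
        except (anthropic.RateLimitError, anthropic.APIConnectionError) as e:
            wait = 2 ** i  # 1s, 2s, 4s, 8s, 16s
            print(f"attempt {i + 1} failed ({type(e).__name__}); retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("gave up after repeated transient failures")

# Usage:
# resp = call_with_backoff(model="claude-...", max_tokens=1024,
#                          messages=[{"role": "user", "content": "..."}])
```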