Developers are choosing older AI models
Scope and Data Skepticism
- Several commenters see the article’s conclusions as weakly supported: they rest on roughly one week of post‑release data from a single tool’s users, and largely exclude local models and non‑Claude/OpenAI usage.
- Some argue the headline overgeneralizes; others note the underlying observation (users spiking on Sonnet 4.5 then drifting back to 4.0) is valid but too narrow to explain industry‑wide behavior.
Speed, Latency, and “Thinking” Overhead
- Speed is repeatedly cited as decisive. New “reasoning” models (GPT‑5, Sonnet 4.5, GLM 4.6, etc.) are described as slower, chattier, and prone to verbose internal “thinking” that users don’t always want.
- Many prefer faster “older” or cheaper models for straightforward tasks, reserving heavy reasoning models only for genuinely complex problems.
- Some predict UX will trend toward instant initial answers with optional deeper drill‑downs, not default multi‑step reasoning.
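The predicted drill‑down UX can be sketched as a simple escalation policy: answer instantly with a fast model, and only invoke a slower reasoning model on request. Everything below is illustrative; `call_model` and the model names are placeholders, not any vendor’s API:

```python
# Sketch of an "instant answer first, reasoning on demand" flow.
# call_model is a stand-in for a real provider SDK call.
def call_model(model: str, prompt: str) -> str:
    # Placeholder: a real implementation would hit a provider API here.
    return f"[{model}] answer to: {prompt}"

def answer(prompt: str, drill_down: bool = False) -> str:
    # Default path: a fast, cheap model responds immediately.
    quick = call_model("fast-model", prompt)
    if not drill_down:
        return quick
    # Optional path: escalate to a slower reasoning model,
    # passing the quick draft along as context to refine.
    return call_model("reasoning-model", f"{prompt}\nDraft: {quick}")
```

The point of the pattern is that multi‑step reasoning becomes opt‑in rather than the default cost of every query.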
Reliability, Instruction Following, and Regressions
- Several report newer models performing worse on real tasks:
  - Sonnet 4.5 is seen as less reliable than Opus 4.1 and even Sonnet 4.0 for coding; some canceled subscriptions over this and over reduced usage limits.
  - GPT‑5 is described as worse than GPT‑4.1 for long‑context RAG (weaker instruction following, overly long answers, smaller effective context).
- Others complain of degraded behavior in tools (e.g., more needless flowcharts, “UI options,” or sycophantic language).
- A few disagree, saying Sonnet 4.5 and GPT‑5 are clear upgrades for complex reasoning, but even they note trade‑offs.
Multi‑Model and Local‑Model Strategies
- Many developers use multiple models: one for planning/reasoning, another for execution, others for speed or specific codebases. Tools that make model‑switching easy are praised.
- Local models (e.g., small Qwen variants, Granite, Llama‑family) are increasingly used for privacy, cost control, and “good enough” tasks, though most agree they still lag top cloud models for hard coding.
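The multi‑model workflow described above amounts to a small routing table mapping task types to models. The task labels and model names here are made up for illustration, not any specific tool’s configuration:

```python
# Illustrative task-to-model routing, as the commenters describe it:
# a reasoning model for planning, a faster model for execution,
# and a local model for private or "good enough" work.
ROUTES = {
    "plan": "cloud-reasoning-model",  # slow, strong multi-step reasoning
    "edit": "cloud-fast-model",       # quick code edits, lower latency
    "private": "local-small-model",   # e.g. a small local Qwen/Llama run
}

def pick_model(task: str) -> str:
    # Fall back to the fast model for unclassified tasks.
    return ROUTES.get(task, ROUTES["edit"])
```

Tools that expose something like this table directly, i.e. make switching cheap, are the ones the thread praises.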
Costs, Limits, and Business Dynamics
- Token‑based pricing, lower usage caps (especially on premium tiers), and model verbosity incentivize using lighter or older models.
- Some see a “drift” or “enshittification” pattern: newer models optimized for safety, alignment, and monetization lose some decisiveness and task fidelity.
- A minority speculate this dynamic—plus possible performance plateaus and data pollution—could help deflate the current AI investment bubble.
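The verbosity incentive in the pricing bullet above is plain arithmetic: under per‑token billing, a chattier model costs more even at identical rates. The prices below are hypothetical, not real vendor pricing:

```python
# Hypothetical per-million-token prices (illustration only).
PRICE = {
    "older-model": {"in": 3.0, "out": 15.0},
    "newer-model": {"in": 3.0, "out": 15.0},  # same rates...
}

def cost_usd(model: str, in_tokens: int, out_tokens: int) -> float:
    p = PRICE[model]
    return (in_tokens * p["in"] + out_tokens * p["out"]) / 1_000_000

# ...but a "thinking" model that emits 4x the output tokens for the
# same task pays roughly 4x on the output side, which is exactly the
# pressure pushing users toward lighter or older models.
quick = cost_usd("older-model", 2_000, 500)     # terse answer
verbose = cost_usd("newer-model", 2_000, 2_000) # with visible reasoning
```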