Claude Fable 5
Model, positioning, and benchmarks
- Fable 5 and Mythos 5 share weights; Fable is the “safety‑nerfed” public version, Mythos is gated to “trusted” partners and US government programs.
- Reported benchmark gains are largest on coding and agentic tasks (SWE‑Bench Pro, FrontierCode, OSWorld, “Humanity’s Last Exam”), though they often amount to only single‑digit or low double‑digit percentage points over Opus 4.8 and GPT‑5.5.
- Some see this as a genuine step‑change, especially for long‑horizon/agentic work; others call it an incremental iteration over‑marketed as a revolution.
Safeguards, refusal rates, and silent nerfs
- Fable routes certain queries (cybersecurity, biology, chemistry) to Opus 4.8; in practice, many users report it triggering on benign queries about coding, GPU drivers, genetics, basic biology, and finance, and even on generic math/health queries.
- Several complain that long agentic runs can be interrupted mid‑task by a safety trigger, degrading usefulness and wasting tokens.
- Anthropic also says it silently limits effectiveness on “frontier LLM development” (pretraining pipelines, accelerator design, etc.) via hidden steering/fine‑tuning, without user-visible fallbacks; many see this as protecting Anthropic’s moat rather than public safety.
Pricing, usage, and subscriptions
- API pricing is 2× Opus 4.8 ($10/M input, $50/M output); some report single sessions burning tens of dollars or large chunks of weekly quotas within minutes.
- Fable 5 is temporarily included in Pro/Max/Team until a fixed date, then becomes usage‑only; many interpret this as a “free trial → push to pay‑as‑you‑go” and fear a gradual hollowing‑out of flat‑rate plans.
- Some argue costs are justified by productivity gains; others say that, in real workflows, token burn, slowdowns, and hallucinations erode that value and push them toward cheaper/open models.
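To make the "tens of dollars per session" reports concrete, here is a rough back-of-the-envelope sketch. It assumes the quoted $10/M input and $50/M output figures are Fable 5's own API rates; the token counts are hypothetical, chosen only to illustrate a long agentic run that repeatedly re-reads its context.

```python
# Rates quoted in the thread, assumed here to be Fable 5's API prices (USD per million tokens).
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def session_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate: float = INPUT_USD_PER_MTOK,
    output_rate: float = OUTPUT_USD_PER_MTOK,
) -> float:
    """Estimate the API cost in USD of one session at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical agentic session: 2M input tokens, 200k output tokens.
cost = session_cost_usd(2_000_000, 200_000)
print(f"${cost:.2f}")  # $30.00
```

Under these assumptions, input tokens account for $20 of the $30, which fits the complaints that long agentic runs are where the bill grows fastest.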
Data retention and privacy
- Mythos‑class models require 30‑day data retention on all surfaces, including through third‑party platforms and even where “zero data retention” was previously guaranteed.
- This is framed as necessary for safety monitoring and jailbreak detection; critics see it as a major regression for compliance (HIPAA, enterprise policies) and as increasing the risk of government access to user data.
Impact on developers, security, and competition
- Early testers report strong improvements in large‑codebase refactors, incident response, and UI design; others find Fable slower, loopier, or no better than Opus 4.6–4.8 or Chinese/open models.
- Security practitioners are worried: Mythos reportedly finds many real vulnerabilities, yet public access is heavily restricted, and many expect criminals to replicate such capabilities via distillation or open models anyway.
- The thread debates open vs. closed models: some think chip controls and distillation defenses will keep open models 1–2 years behind; others point to DeepSeek/Qwen pricing as proof that frontier‑like capability will escape quickly.