2026-06-09

Claude Fable 5

Model, positioning, and benchmarks

Fable 5 and Mythos 5 share weights; Fable is the “safety‑nerfed” public version, Mythos is gated to “trusted” partners and US government programs.
Reported benchmark gains on coding and agentic tasks (SWE‑Bench Pro, FrontierCode, OSWorld, “Humanity’s Last Exam”) are presented as large, but are often only single‑digit or low double‑digit percentage points over Opus 4.8 and GPT‑5.5.
Some see this as a genuine step‑change, especially for long‑horizon/agentic work; others call it an incremental iteration over‑marketed as a revolution.

Safeguards, refusal rates, and silent nerfs

Fable routes certain queries (cybersecurity, biology, chemistry) to Opus 4.8; in practice, many users report the reroute triggering on benign coding, GPU drivers, genetics, basic biology, finance, and even generic math/health queries.
Several users complain that a safety trigger can interrupt a long agentic run mid‑task, degrading usefulness and wasting tokens.
Anthropic also says it silently limits effectiveness on “frontier LLM development” (pretraining pipelines, accelerator design, etc.) via hidden steering/fine‑tuning, without user-visible fallbacks; many see this as protecting Anthropic’s moat rather than public safety.

Pricing, usage, and subscriptions

API pricing is 2× Opus 4.8 ($10/M input, $50/M output); some report single sessions burning tens of dollars or large chunks of weekly quotas within minutes.
Fable 5 is temporarily included in Pro/Max/Team until a fixed date, then becomes usage‑only; many interpret this as a “free trial → push to pay‑as‑you‑go” and fear a gradual hollowing‑out of flat‑rate plans.
Some argue the costs are justified by productivity gains; others say that, in real workflows, token burn, slowdowns, and hallucinations erode that value and push them toward cheaper/open models.
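
To put the "tens of dollars per session" reports in context, here is a rough back-of-the-envelope sketch in Python. It assumes the quoted rates ($10/M input, $50/M output) are Fable 5's API prices; the function name and the token counts are hypothetical and chosen only for illustration.

```python
# Assumption: the rates quoted in the thread are Fable 5's API prices,
# in US dollars per million tokens.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def session_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one session from its token counts."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Hypothetical agentic session: 2M input tokens (context that is re-read
# across many turns) and 300k output tokens.
cost = session_cost_usd(2_000_000, 300_000)
print(f"${cost:.2f}")  # prints "$35.00"
```

Under these assumed numbers, input tokens account for $20 of the $35, which illustrates why long agentic runs that repeatedly resend large contexts can get expensive even when the output is modest.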

Data retention and privacy

Mythos‑class models require 30‑day data retention on all surfaces, including through third‑party platforms and even where “zero data retention” was previously guaranteed.
This is framed as necessary for safety monitoring and jailbreak detection; critics see it as a major regression for compliance (HIPAA, enterprise policies) and as increasing the risk of government access.

Impact on developers, security, and competition

Early testers report strong improvements in large‑codebase refactors, incident response, and UI design; others find Fable slower, loopier, or no better than Opus 4.6–4.8 or Chinese/open models.
Security practitioners are worried: Mythos reportedly finds many real vulnerabilities, yet public access is heavily restricted, and many expect criminals to replicate such capabilities via distillation or open models anyway.
The thread debates open versus closed models: some think chip controls and distillation defenses will keep open models 1–2 years behind; others point to DeepSeek/Qwen pricing as proof that frontier‑like capability will escape quickly.
