Anthropic announces proof of distillation at scale by MiniMax, DeepSeek, and Moonshot
Scale and Feasibility of Distillation
- Key data point: ~16M Claude chat sessions (via ~24k accounts) were enough to substantially distill its behavior; commenters see this as a surprisingly low barrier and evidence that Anthropic’s moat is thin.
- People infer that future “industrial-scale” distillation is practically unavoidable as long as high-end models are exposed via public APIs.
- Some wonder whether this data volume could train not just alignment and formatting behavior but parts of a base model; rough numbers (~0.5T tokens if sessions average long contexts) make this plausible but unproven (see the back-of-envelope sketch after this list).
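
The ~0.5T-token estimate is easy to sanity-check. A minimal back-of-envelope sketch: only the ~16M session count comes from the announcement; the per-session token averages below are illustrative assumptions.

```python
# Back-of-envelope: token volume implied by ~16M chat sessions.
# Only the ~16M session count is from the announcement; the
# per-session token averages are illustrative assumptions.

SESSIONS = 16_000_000

for tokens_per_session in (1_000, 10_000, 30_000, 100_000):
    total_tokens = SESSIONS * tokens_per_session
    print(f"{tokens_per_session:>7,} tokens/session -> {total_tokens / 1e12:.2f}T tokens")
```

The corpus only reaches the ~0.5T-token range if sessions average roughly 30k tokens each, i.e. consistently long contexts, which is why commenters treat the base-model-training scenario as plausible but unconfirmed.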
IP, Scraping, and Hypocrisy
- Dominant reaction: Anthropic is accused of "living by the sword, dying by the sword": having trained on scraped, often copyrighted human content, it now objects when others scrape and distill its outputs.
- Many say they feel no sympathy for a lab that benefited from broad, often non-consensual data use and now wants its own outputs treated as protected IP.
- Several note this tweet will likely be cited in future lawsuits as evidence that Anthropic believes unauthorized use of IP meaningfully harms rights-holders.
Competition, Business Models, and “Prisoner’s Dilemma”
- One camp: distillation undermines the incentive to invest hundreds of millions in frontier training, pushing labs to lock down models or seek regulation, a "prisoner's dilemma" that could slow progress.
- The counter-camp: Chinese and other distilled or open-weight models have already forced US labs to improve faster and cut prices; competition is working, not breaking.
- Some ask why Anthropic doesn’t release its own distilled open-weight models if it truly cares about broad access.
Geopolitics, Regulation, and National Security Framing
- Many see the announcement as political messaging aimed at regulators, not customers: tying Chinese distillation to export controls, national security, and bans on “foreign AI.”
- There’s discussion of emerging US bills to restrict Chinese models for government contractors, and speculation about broader domestic bans.
- Others note that US labs also rely on scraping and question why Chinese labs should respect US IP when export controls try to hold them back.
Broader Debates: Safety, Environment, and IP Philosophy
- Some agree that if distillation cuts energy/compute by ~100×, it is ethically preferable to repeated huge training runs.
- Safety concerns surface around distilled models losing safeguards and being used as agents on the open web; others dismiss this as overblown or solvable via tooling.
- A long subthread debates whether modern copyright meaningfully serves individual creators versus large corporations, with some arguing for radically weakening or abolishing IP altogether.