Anthropic announces proof of distillation at scale by MiniMax, DeepSeek, and Moonshot

Scale and Feasibility of Distillation

  • Key data point: ~16M Claude chat sessions (via ~24k accounts) were enough to substantially distill its behavior; commenters see this as a surprisingly low barrier and evidence that Anthropic’s moat is thin.
  • People infer that future “industrial-scale” distillation is practically unavoidable as long as high-end models are exposed via public APIs.
  • Some wonder whether this data volume could train not just alignment/formatting but parts of a base model; back-of-envelope numbers (~0.5T tokens, assuming long contexts) make this plausible but uncertain.
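The ~0.5T-token estimate in the bullet above can be sanity-checked with simple arithmetic. This sketch assumes an average of ~30k tokens per session — a number not given in the thread, chosen only to show how the figure could arise from ~16M sessions:

```python
# Back-of-envelope check of the ~0.5T-token figure.
# The per-session token count is an assumption, not from the announcement.
sessions = 16_000_000          # ~16M Claude chat sessions (from the thread)
tokens_per_session = 30_000    # assumed long-context average per session

total_tokens = sessions * tokens_per_session
print(f"~{total_tokens / 1e12:.2f}T tokens")  # ~0.48T
```

Shorter average sessions (a few thousand tokens) would put the total closer to tens of billions of tokens — enough for distillation of style and behavior, but well short of a typical base-model pretraining corpus, which is why commenters call the claim plausible but unclear.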

IP, Scraping, and Hypocrisy

  • Dominant reaction: Anthropic is accused of “living by the sword, dying by the sword” — having trained on scraped/copyrighted human content, then objecting when others scrape/distill their outputs.
  • Many say they feel no sympathy for a lab that benefited from broad, often non-consensual data use and now wants its own outputs treated as protected IP.
  • Several note this tweet will likely be cited in future lawsuits as evidence Anthropic believes unauthorized use of IP meaningfully harms rights-holders.

Competition, Business Models, and “Prisoner’s Dilemma”

  • One camp: distillation threatens incentive to invest hundreds of millions in frontier training, pushing labs to lock down models or seek regulation — a “prisoner’s dilemma” that could slow progress.
  • The counter-camp: Chinese and other distilled/open models have already forced US labs to improve faster and lower prices; competition is working, not breaking.
  • Some ask why Anthropic doesn’t release its own distilled open-weight models if it truly cares about broad access.

Geopolitics, Regulation, and National Security Framing

  • Many see the announcement as political messaging aimed at regulators, not customers: tying Chinese distillation to export controls, national security, and bans on “foreign AI.”
  • There’s discussion of emerging US bills to restrict Chinese models for government contractors, and speculation about broader domestic bans.
  • Others note that US labs also rely on scraping and question why Chinese labs should respect US IP when export controls try to hold them back.

Broader Debates: Safety, Environment, and IP Philosophy

  • Some agree that if distillation cuts energy/compute by ~100×, it is ethically preferable to repeated huge training runs.
  • Safety concerns surface around distilled models losing safeguards and being used as agents on the open web; others dismiss this as overblown or solvable via tooling.
  • A long subthread debates whether modern copyright meaningfully serves individual creators versus large corporations, with some arguing for radically weakening or abolishing IP altogether.