Thomson Reuters wins first major AI copyright case in the US
What the case was actually about
- Ross Intelligence built a legal search tool meant to substitute for Westlaw.
- Ross used Westlaw headnotes and the key-number index as training data indirectly, through human-written “translations” of those headnotes.
- The court held that Westlaw’s headnotes are individually copyrightable works, not mere uncopyrightable facts or raw case law.
- Ross’s system was essentially a semantic search engine over those headnotes (non‑generative), returning case opinions, not new text.
Fair use analysis and copyrightability
- The judge emphasized purpose and market effect: Ross meant to create a cheaper market substitute built on Westlaw’s value‑added annotations.
- That intent and commercial competition weighed heavily against fair use, even though end users did not see verbatim headnotes.
- The opinion analogizes headnote selection to sculpture: choosing which parts of an opinion to quote or summarize is a creative act.
- Some commenters think extending copyright to “selection” of quotes is overbroad and likely vulnerable on appeal; others say it fits existing doctrine that creativity, not “sweat of the brow,” is what matters.
Implications for AI and LLM training
- One camp sees this as a narrow ruling about copying proprietary summaries to build a directly competing search service, not about broad LLM training.
- Another camp thinks the fair‑use reasoning (non‑transformative, market‑substituting use of copyrighted inputs) is a worrying precedent for generative AI trained on news, books, art, etc.
- The thread splits over whether training itself infringes or whether only distribution and outputs matter; no consensus emerges.
- Some note that if generative AIs exist mainly to be cheaper substitutes for human creators, that undercuts fair‑use arguments.
Power, open source, and future licensing regimes
- Many expect large AI vendors to pivot to licensing major corpora, further entrenching big tech and legacy media, and locking out open‑source and small players.
- Others argue that licensing “everything” is practically impossible; this pressure might force new law (e.g., compulsory licensing/collecting societies for training data).
- Several commenters explicitly prefer strong enforcement, even if it slows or restricts AI, to prevent uncompensated mass appropriation.
Broader concerns and side discussions
- Some worry that non‑Western models will ignore copyright and outpace Western systems.
- Commenters draw analogies to Google News snippets, phone books, court reporters, and educational fair use.
- Some argue this is “good for humans” and creators; others see it as shifting AI power from startups and open communities to a few well‑capitalized firms.