2025-02-11

Thomson Reuters wins first major AI copyright case in the US

What the case was actually about

Ross Intelligence built a legal search tool meant to substitute for Westlaw.
Ross trained it on Westlaw headnotes and key-number indexing, working from human-written “translations” of those headnotes.
The court held that Westlaw’s headnotes are individually copyrightable works, not mere uncopyrightable facts or raw case law.
Ross’s system was essentially a semantic search engine over those headnotes (non‑generative), returning case opinions, not new text.

Fair use analysis and copyrightability

The judge emphasized purpose and market effect: Ross meant to create a cheaper market substitute using Westlaw’s value‑add annotations.
That intent and commercial competition weighed heavily against fair use, even though end users did not see verbatim headnotes.
The opinion analogizes headnote selection to sculpture: choosing which parts of an opinion to quote or summarize is a creative act.
Some commenters think extending copyright to “selection” of quotes is overbroad and likely vulnerable on appeal; others say it fits existing doctrine that creativity, not “sweat of the brow,” is what matters.

Implications for AI and LLM training

One camp sees this as a narrow ruling about copying proprietary summaries to build a directly competing search service, not about broad LLM training.
Another camp thinks the fair‑use reasoning (non‑transformative, market‑substituting use of copyrighted inputs) is a worrying precedent for generative AI trained on news, books, art, etc.
Commenters split over whether training itself is infringement or whether only distribution and outputs matter; the thread reaches no consensus.
Some note that if generative AIs exist mainly to be cheaper substitutes for human creators, that undercuts fair‑use arguments.

Power, open source, and future licensing regimes

Many expect large AI vendors to pivot to licensing major corpora, further entrenching big tech and legacy media, and locking out open‑source and small players.
Others argue that licensing “everything” is practically impossible; this pressure might force new law (e.g., compulsory licensing/collecting societies for training data).
Several commenters explicitly prefer strong enforcement, even if it slows or restricts AI, to prevent uncompensated mass appropriation.

Broader concerns and side discussions

Some worry that non‑Western models will ignore copyright and outpace Western systems.
Commenters draw analogies to Google News snippets, phone books, court reporters, and educational fair use.
Some argue this is “good for humans” and creators; others see it as shifting AI power from startups and open communities to a few well‑capitalized firms.
