Judge dismisses DMCA copyright claim in GitHub Copilot suit
Case outcome and legal reasoning
- Judge dismissed the DMCA §1202(b) claim: the court read §1202(b) as requiring an identical copy with copyright management information removed, and plaintiffs did not show Copilot outputting their code identically with copyright/attribution stripped.
- Commenters note the ruling is narrow: about DMCA “copyright management information,” not all copyright issues.
- Some think plaintiffs’ strategy was weak: they alleged verbatim copying but couldn’t produce a single accepted example from their own code.
Evidence and “identicality”
- People recall public demos of Copilot reproducing famous snippets (e.g., Quake fast inverse sqrt) or NYT text, but note:
  - Those rights-holders weren’t plaintiffs here.
  - Courts require evidence tied to plaintiffs’ works, not “in theory this happens.”
- GitHub reportedly added a “copyright filter”; debate on whether that’s prudence or “destroying evidence.” Others note old versions still exist and can be subpoenaed.
Training on copyrighted code and fair use
- One side: training on public code (even GPL, art, prose) is non‑infringing “learning”; function and style aren’t protected, only expression.
- Other side: training creates a derivative commercial product built on copyrighted works without consent or compensation; fair use was never meant for mass AI training.
- Dispute over whether model weights are a derivative work and whether paraphrased output can still infringe or violate licenses (e.g., GPL conditions, attribution).
Ethics, scale, and impact on creators
- Critics see AI training on non‑consenting artists’ and coders’ work as “pure exploitation,” especially when it displaces their income.
- Defenders argue automation has always displaced labor; the economic problem is distribution, not the tool.
- Scale and lack of accountability of machine agents are recurring concerns.
Licensing, GitHub, and OSS reactions
- Debate on whether GitHub’s ToS gives it rights to use code for Copilot; some quote language that seems limited to “providing the service.”
- Edge cases: code uploaded by non‑authors; GPL projects mirrored on GitHub; authors whose code was uploaded by others.
- Some propose “no-AI” or anti-training licenses; others note that if training is ruled fair use, such clauses may be ineffective, and that licenses carrying them would not qualify as FOSS.
- A few developers say they’ll stop publishing open source or avoid GitHub; others think the OSS ecosystem will largely continue.
Technical behavior of LLMs
- Discussion of memorization vs abstraction: models usually compress patterns, but can “recite” training data in some prompts.
- Filters that avoid verbatim output don’t prevent close paraphrases, which may still raise legal and ethical questions.
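The last point, that a verbatim filter can miss close paraphrases, can be illustrated with a toy sketch. Everything here is hypothetical: `blocked_by_verbatim_filter`, `structural_similarity`, the 8-token window, and both code samples are assumptions made for illustration, and nothing in the thread describes how GitHub's actual filter works.

```python
import re

KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}

def tokens(code):
    # Split into identifiers, numbers, and single punctuation characters.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def ngrams(toks, n):
    # Set of all runs of n consecutive tokens.
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def blocked_by_verbatim_filter(output, corpus, n=8):
    # Hypothetical filter: block the output if any n-token run of it
    # appears verbatim in a reference corpus.
    seen = set()
    for doc in corpus:
        seen |= ngrams(tokens(doc), n)
    return any(g in seen for g in ngrams(tokens(output), n))

def normalize(toks):
    # Replace every non-keyword identifier with a placeholder,
    # so that renaming variables no longer changes the token stream.
    return ["ID" if re.match(r"[A-Za-z_]", t) and t not in KEYWORDS else t
            for t in toks]

def structural_similarity(a, b, n=4):
    # Jaccard overlap of n-grams after identifier normalization.
    ga = ngrams(normalize(tokens(a)), n)
    gb = ngrams(normalize(tokens(b)), n)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0

# Made-up "training" snippet and a paraphrase that only renames identifiers.
ORIGINAL = """def total(items):
    result = 0
    for item in items:
        result = result + item.price
    return result
"""

PARAPHRASE = """def sum_cost(products):
    acc = 0
    for p in products:
        acc = acc + p.price
    return acc
"""

print(blocked_by_verbatim_filter(ORIGINAL, [ORIGINAL]))    # exact copy
print(blocked_by_verbatim_filter(PARAPHRASE, [ORIGINAL]))  # renamed copy
print(structural_similarity(ORIGINAL, PARAPHRASE))
```

In this toy setup the exact copy is blocked, while the renamed version passes the filter even though its structure is identical once identifiers are normalized. Whether output like that still infringes or violates a license is the legal question the thread leaves open.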