OpenAI may not use lyrics without license, German court rules
Scope of the ruling & liability debate
- Discussion centers on a German court finding OpenAI liable when its models reproduce song lyrics, rejecting OpenAI’s argument that only users prompting the model should be responsible.
- Key legal hinge: the court treats LLM weights as containing (lossy) copies of training data; verbatim or near‑verbatim lyrics in output = stored and redistributed copies.
- Some see this as consistent with long‑standing copyright rules (memorizing a song and then writing it out is still infringement); others think it stretches "copy" and "memorization" to the point of absurdity.
AI vs humans, tools, and platforms
- Analogies debated:
  - Secretary reading lyrics to a boss; artist drawing Mickey Mouse on commission; Word vs ChatGPT; piracy streaming sites; YouTube/Google search previews.
- One camp: if it would be legal for humans doing this at scale under corporate direction, it should be legal for AI; if not, AI shouldn’t get a special pass.
- Others stress scale and automation: a commercial service that can systematically output protected works is closer to a lyrics database or piracy host than to a private human memory.
Impact on OpenAI, market, and EU
- Some expect OpenAI to geo‑restrict Germany/the EU and accept that users will bypass the block via VPN; others argue 80M+ Germans (and the EU single market as a whole) are too big to abandon, so OpenAI will either filter lyrics harder or license them.
- There’s debate on whether this sets a broader precedent for all copyrighted text, including code and books, and whether open‑weight models capable of regurgitation could be banned or chilled.
GEMA, licensing, and gatekeeping
- Mixed views on GEMA and similar collecting societies: seen both as essential for rightsholders and as rent‑seeking, zero‑sum, and hostile to innovation (past YouTube blocks are cited).
- Some predict a “pay them off” settlement and expansion of flat fees or “AI levies” on subscriptions; concern that large players will afford licenses and smaller startups will be locked out.
Artists, AI slop, and incentives
- Worries that ubiquitous low‑effort “AI slop” (including lyrics commentary sites) degrades the web, disincentivizes original creation, and centralizes cultural wealth with AI platforms.
- Others argue people will create art regardless, but fear that AI will capture most of the economic upside, pushing human‑made work into a niche “premium” category.
Copyright, innovation, and fairness
- Strong split:
  - One side views strict copyright (DMCA, lyrics control) as stifling a major technological breakthrough; suggests weakening or abolishing parts of IP law.
  - The other emphasizes asymmetry: individuals are punished for piracy while large AI firms mass‑ingest copyrighted works, lock models behind paywalls, and externalize legal risk to users.
- Some propose a clearer framework: training on copyrighted data allowed only with licensing and/or when outputs and models themselves remain open; otherwise, creators deserve compensation.
Technical responses and feasibility
- Commenters note AI companies already attempt to block lyrics/news reproduction via system prompts and output filters, but jailbreaks remain easy.
- Ideas raised: deduplicating repeated sequences in training data; removing specific lyrics post‑hoc; or shifting to architectures that rely on external (licensed) retrieval for facts/lyrics.
- Others doubt such filtering can ever be watertight, implying ongoing legal friction between generative models and copyright regimes.
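The deduplication idea mentioned above can be illustrated with a toy sketch: drop any training document that repeats a long word n‑gram already seen elsewhere in the corpus, on the theory that verbatim memorization is driven largely by repeated sequences. This is a simplification assumed for illustration; production pipelines typically work on tokens with suffix‑array or hash‑based exact‑substring matching and remove only the duplicated spans, not whole documents.

```python
def dedup_ngrams(docs, n=8):
    """Keep each document only if none of its word n-grams
    has already appeared in an earlier kept document.

    Toy sketch of training-data deduplication: a real pipeline
    would operate on tokens and excise duplicated spans rather
    than discarding entire documents.
    """
    seen = set()   # all n-grams observed in kept documents
    kept = []
    for doc in docs:
        words = doc.split()
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        if grams & seen:
            continue  # repeats a long span already in the corpus: drop it
        kept.append(doc)
        seen |= grams
    return kept
```

With `n=8`, a document that quotes eight or more consecutive words from an earlier one (say, a run of song lyrics pasted across many pages) is discarded, while short incidental overlaps pass through. The threshold `n` trades off recall against false positives on common phrases.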