OpenAI may not use lyrics without license, German court rules

Scope of the ruling & liability debate

  • Discussion centers on a German court finding OpenAI liable when its models reproduce song lyrics, rejecting OpenAI’s argument that only users prompting the model should be responsible.
  • Key legal hinge: the court treats LLM weights as containing (lossy) copies of training data; verbatim or near‑verbatim lyrics in output = stored and redistributed copies.
  • Some see this as consistent with long‑standing copyright rules (memorizing then writing out a song is still infringement); others think it stretches “copy” and “memorization” to absurdity.

AI vs humans, tools, and platforms

  • Analogies debated:
    • Secretary reading lyrics to a boss; artist drawing Mickey Mouse on commission; Word vs ChatGPT; piracy streaming sites; YouTube/Google search previews.
  • One camp: if it would be legal for humans to do this at scale under corporate direction, it should be legal for AI; if not, AI shouldn’t get a special pass.
  • Others stress scale and automation: a commercial service that can systematically output protected works is closer to a lyrics database or piracy host than to a private human memory.

Impact on OpenAI, market, and EU

  • Some expect OpenAI to geo‑restrict Germany/EU or simply accept that users will bypass the block via VPNs; others argue 80M+ Germans (and the whole EU single market) are too big to abandon, so OpenAI will either filter lyrics harder or license them.
  • There’s debate on whether this sets a broader precedent for all copyrighted text, including code and books, and whether open‑weight models capable of regurgitation could be banned or chilled.

GEMA, licensing, and gatekeeping

  • Mixed views on GEMA and similar collecting societies: seen both as essential for rightsholders and as rent‑seeking, zero‑sum, and hostile to innovation (past YouTube blocks are cited).
  • Some predict a “pay them off” settlement and an expansion of flat fees or “AI levies” on subscriptions; concern that large players can afford licenses while smaller startups will be locked out.

Artists, AI slop, and incentives

  • Worries that ubiquitous low‑effort “AI slop” (including lyrics commentary sites) degrades the web, disincentivizes original creation, and centralizes cultural wealth with AI platforms.
  • Others argue people will create art regardless, but fear that AI will capture most of the economic upside, pushing human‑made work into a niche “premium” category.

Copyright, innovation, and fairness

  • Strong split:
    • One side views strict copyright (DMCA, lyrics control) as stifling a major technological breakthrough and suggests weakening or abolishing parts of IP law.
    • The other emphasizes asymmetry: individuals are punished for piracy while large AI firms mass‑ingest copyrighted works, lock models behind paywalls, and externalize legal risk to users.
  • Some propose a clearer framework: training on copyrighted data allowed only with licensing and/or when outputs and models themselves remain open; otherwise, creators deserve compensation.

Technical responses and feasibility

  • Commenters note that AI companies already try to block lyrics/news reproduction via system prompts and output filters, but jailbreaks remain easy (a toy output‑filter sketch follows this list).
  • Ideas raised: deduplicating repeated sequences in the training data (also sketched below); removing specific lyrics post‑hoc; or shifting to architectures that fetch facts/lyrics from external, licensed retrieval sources.
  • Others doubt such filtering can ever be watertight, implying ongoing legal friction between generative models and copyright regimes.
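The output‑filtering idea can be made concrete with a toy example. The sketch below assumes a simple word n‑gram overlap check against an indexed lyrics corpus; the names (KNOWN_LYRICS, looks_like_lyrics) and the 0.3 threshold are illustrative, not any vendor’s actual pipeline, and as commenters point out, paraphrases or character substitutions would still slip past it.

```python
# Hypothetical sketch only: a naive n-gram overlap filter for model outputs.
# KNOWN_LYRICS, PROTECTED_NGRAMS, and looks_like_lyrics are illustrative names,
# not part of any real moderation stack.
import re

KNOWN_LYRICS = [
    # Toy stand-in for an indexed corpus of protected lyrics.
    "we're no strangers to love you know the rules and so do i",
]

def word_ngrams(text: str, n: int = 5) -> set:
    """Normalize text and return its set of word-level n-grams."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

# Index the protected corpus once, up front.
PROTECTED_NGRAMS = set()
for lyric in KNOWN_LYRICS:
    PROTECTED_NGRAMS |= word_ngrams(lyric)

def looks_like_lyrics(candidate: str, threshold: float = 0.3) -> bool:
    """Flag output whose n-grams overlap heavily with the protected corpus."""
    grams = word_ngrams(candidate)
    if not grams:
        return False
    overlap = len(grams & PROTECTED_NGRAMS) / len(grams)
    return overlap >= threshold

if __name__ == "__main__":
    reply = "We're no strangers to love. You know the rules, and so do I."
    print(looks_like_lyrics(reply))  # True: near-verbatim reproduction is caught
    print(looks_like_lyrics("The court ruled against OpenAI."))  # False
```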
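The deduplication idea rests on the observation that sequences repeated many times across the training set are the ones a model is most likely to memorize verbatim. A minimal sketch of shingle‑based dedup follows; dedup_corpus and SHINGLE_LEN are made‑up names, and production pipelines typically use suffix arrays or MinHash rather than exact hashing at this scale.

```python
# Hypothetical sketch only: skip training documents that mostly repeat
# word sequences already seen elsewhere in the corpus.
import hashlib

SHINGLE_LEN = 8  # length (in words) of the repeated sequences we look for

def shingles(text: str, n: int = SHINGLE_LEN):
    """Yield hashed word-level shingles of the text."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield hashlib.sha1(" ".join(words[i:i + n]).encode()).hexdigest()

def dedup_corpus(documents, max_seen_fraction: float = 0.5):
    """Keep documents whose shingles are mostly unseen; drop heavy repeats."""
    seen = set()
    kept = []
    for doc in documents:
        doc_shingles = list(shingles(doc))
        if not doc_shingles:
            kept.append(doc)  # too short to judge; keep as-is
            continue
        repeated = sum(1 for s in doc_shingles if s in seen)
        if repeated / len(doc_shingles) < max_seen_fraction:
            kept.append(doc)
        seen.update(doc_shingles)
    return kept
```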