AI and the Ship of Theseus

LLMs, GPL, and Derivative Works

  • Major debate over whether code generated by LLMs trained on GPL or unlicensed (“all rights reserved”) code should itself be considered GPL or otherwise restricted.
  • Some argue any significant capability derived from GPL code should “infect” all outputs; others counter that learning general patterns (like a human) does not create derivative works.
  • Edge cases raised: tiny snippets (e.g., common boilerplate), mixing many small GPL fragments, and test suites as potential sources of derivation.

Copyright Status of AI-Generated Code

  • Several comments assert that in the US, purely machine‑generated code cannot be copyrighted, hence cannot be licensed.
  • Others push back, citing Copyright Office guidance: human-directed use of AI can still produce copyrightable work if human creativity is substantial.
  • It is unclear how much prompt control or post‑editing is needed for code to qualify; the rough consensus is that future court cases will be messy.

Ethical Views on Training and Reimplementation

  • Some describe current LLM practice as “industry built on theft”: models trained on copyrighted and copyleft code without permission, attribution, or compensation.
  • Others focus on practicality: large capital backing means these models will not be rolled back, regardless of ethics.
  • Moral questions about using AI to relicense or “slopfork” long‑maintained projects; some see it as deeply unethical, others as legitimate modernization.

Impact on Open Source Licensing and Strategy

  • One camp predicts licensing effectively collapses to two poles, very permissive (MIT/BSD) or fully closed, since AI plus reverse engineering can clone most software.
  • Others argue GPL still matters: historically forced corporate contributions and protected user freedoms; permissive licenses risk one‑way extraction.
  • Some note LGPL’s intent to protect users (swappability, tinkering) and see attempts to bypass it as undermining those protections.

Reverse Engineering and Reimplementation Costs

  • Multiple examples of LLMs reimplementing libraries or protocols from tests, specs, network traces, or binaries, often with improved performance.
  • View that copyleft enforcement relied on reimplementation being costly; if AI makes clean‑room rewrites cheap, that economic basis erodes.
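The dynamic above can be made concrete with a minimal sketch of test‑driven clean‑room reimplementation: the rewriter (human or LLM) sees only a behavioral test suite, never the original source. The function name `slugify` and the test cases here are hypothetical, chosen purely for illustration.

```python
# Sketch of a "clean-room" rewrite: the tests below act as the behavioral
# spec — the only artifact the reimplementer is shown. The implementation
# is written fresh to satisfy them, without consulting prior source.

def slugify(title: str) -> str:
    """Fresh implementation derived only from the test suite below."""
    # Lowercase, replace every non-alphanumeric character with a space,
    # then join the remaining words with hyphens.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())

# Behavioral spec (hypothetical examples):
assert slugify("Hello, World!") == "hello-world"
assert slugify("  GPL v3  ") == "gpl-v3"
assert slugify("Ship of Theseus") == "ship-of-theseus"
```

If an LLM can cheaply produce such rewrites at library scale from tests, specs, or traces alone, the cost barrier that copyleft enforcement implicitly relied on largely disappears.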

Proposed New Licensing Responses

  • Suggestions for AI‑oriented copyleft (“AIGPL”): if you train on a work or feed it as input, model weights and outputs must inherit the license.
  • Others note licenses cannot unilaterally redefine “derivative work,” but could still impose contractual conditions on training or use.
  • Some predict tests and specs, not implementations, may become the primary proprietary asset.

Broader IP and Morality Debates

  • Thread revisits whether “intellectual property” is conceptually sound, with critiques around patent trolling, corporate power, and software patents.
  • Disagreement over whether dismantling IP would kill R&D (e.g., in medicine) or could be replaced by public funding and focus on physical goods.
  • Several comments criticize sidelining morality in favor of legalistic or technical arguments, seeing it as symptomatic of wider societal issues.