AI and the Ship of Theseus
LLMs, GPL, and Derivative Works
- Major debate over whether LLM‑generated code trained on GPL or unlicensed (“all rights reserved”) code should itself inherit the GPL or otherwise be restricted.
- Some argue any significant capability derived from GPL code should “infect” all outputs; others counter that learning general patterns (like a human) does not create derivative works.
- Edge cases raised: tiny snippets (e.g., common boilerplate), mixing many small GPL fragments, and test suites as potential sources of derivation.
Copyright Status of AI-Generated Code
- Several comments assert that in the US, purely machine‑generated code cannot be copyrighted, hence cannot be licensed.
- Others push back, citing Copyright Office guidance: human-directed use of AI can still produce copyrightable work if human creativity is substantial.
- Unclear how much prompt control or post‑editing is needed for code to qualify; consensus is that future court cases will be messy.
Ethical Views on Training and Reimplementation
- Some describe current LLM practice as “industry built on theft”: models trained on copyrighted and copyleft code without permission, attribution, or compensation.
- Others focus on practicality: large capital backing means these models will not be rolled back, regardless of ethics.
- Moral questions about using AI to relicense or “slopfork” long‑maintained projects; some see it as deeply unethical, others as legitimate modernization.
Impact on Open Source Licensing and Strategy
- One camp predicts licenses effectively collapse into two poles: very permissive (MIT/BSD) versus fully closed, since AI‑assisted reverse engineering can clone most software.
- Others argue the GPL still matters: it has historically forced corporations to contribute back and protected user freedoms, whereas permissive licenses risk one‑way extraction.
- Some note LGPL’s intent to protect users (swappability, tinkering) and see attempts to bypass it as undermining those protections.
Reverse Engineering and Reimplementation Costs
- Multiple examples of LLMs reimplementing libraries or protocols from tests, specs, network traces, or binaries, often with improved performance.
- View that copyleft enforcement relied on reimplementation being costly; if AI makes clean‑room rewrites cheap, that economic basis erodes.
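The point above rests on tests acting as a de facto specification. A minimal sketch (all names hypothetical, not drawn from the thread): a handful of behavioral assertions pin down a function's contract, and any clean‑room reimplementation that passes them is a drop‑in replacement, whether written by a human or generated by a model that never saw the original source.

```python
# Hypothetical illustration: a behavioral test suite as a specification.
# Any implementation that passes these assertions is interchangeable with
# the original, regardless of how (or by whom) it was written.

def slugify(title: str) -> str:
    """Clean-room reimplementation written only against the tests below."""
    # Lowercase alphanumerics, turn everything else into separators.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in title)
    return "-".join(cleaned.split())

# The "specification": assertions capturing the original library's
# observable behavior (e.g., recovered from its public test suite).
assert slugify("Hello, World!") == "hello-world"
assert slugify("  GPL v3  ") == "gpl-v3"
assert slugify("Ship of Theseus") == "ship-of-theseus"
```

If such a suite is cheap to extract and an LLM can iterate against it until everything passes, the reimplementation cost that copyleft enforcement implicitly relied on largely disappears.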
Proposed New Licensing Responses
- Suggestions for AI‑oriented copyleft (“AIGPL”): if you train on a work or feed it as input, model weights and outputs must inherit the license.
- Others note licenses cannot unilaterally redefine “derivative work,” but could still impose contractual conditions on training or use.
- Some predict tests and specs, not implementations, may become the primary proprietary asset.
Broader IP and Morality Debates
- Thread revisits whether “intellectual property” is conceptually sound, with critiques around patent trolling, corporate power, and software patents.
- Disagreement over whether dismantling IP would kill R&D (e.g., in medicine) or could be replaced by public funding and focus on physical goods.
- Several comments criticize sidelining morality in favor of legalistic or technical arguments, seeing it as symptomatic of wider societal issues.