Is legal the same as legitimate: AI reimplementation and the erosion of copyleft

Legal vs moral legitimacy

  • Many distinguish sharply between “legal” and “legitimate.”
  • Law only sets what won’t be punished; it doesn’t guarantee ethical acceptability (e.g., tax avoidance, drug price hikes).
  • Some argue that criticizing AI relicensing amounts to “blasting” others for ignoring ethics; others counter that moral debate which attacks uninvolved parties is itself unethical.

Copyleft’s goals and AI “license washing”

  • Copyleft is seen as using copyright to secure user freedoms (run, study, modify, redistribute) and keep improvements in the commons, not as destroying copyright outright.
  • AI-assisted rewrites of GPL/LGPL code to permissive or proprietary licenses are viewed by many as breaking the social compact (“share back under same terms”) even if they’re arguably legal.
  • Others reply that reimplementation from behavior/specs has long been used to free proprietary software; if that was celebrated, it’s inconsistent to condemn the reverse.

Clean-room, APIs, tests, and derivative works

  • Big debate over whether the AI-assisted rewrite of chardet (a Python character-encoding detection library) is a genuine “clean-room” implementation.
    • Points against: maintainer deeply knew the old code; LLM likely trained on it; design doc allegedly had the agent download and reference original files; tests are themselves LGPL “source.”
    • Points for: reported low textual similarity; only API + tests used at generation time; functionality, not code, was copied.
  • Disagreement on whether using a GPL test suite to drive a rewrite makes the result a “work based on the library.”
  • Google v. Oracle is repeatedly invoked: APIs are copyrightable but reimplementation can be fair use; some say tests exercise functionality, not expression.
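The “reported low textual similarity” claim above can be made concrete. A minimal sketch, using Python’s stdlib `difflib`, of how a textual-similarity ratio between an original and a rewrite might be computed; the snippets, function names, and threshold are illustrative assumptions, not taken from the actual chardet comparison:

```python
import difflib

def textual_similarity(original: str, rewrite: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two source texts."""
    return difflib.SequenceMatcher(None, original, rewrite).ratio()

# Hypothetical original vs. rewritten snippet (invented for illustration).
original = "def detect(buf):\n    return scan_byte_frequencies(buf)\n"
rewrite = (
    "def detect(data):\n"
    "    histogram = count_bytes(data)\n"
    "    return classify(histogram)\n"
)

ratio = textual_similarity(original, rewrite)
print(f"similarity: {ratio:.2f}")
```

Note the limits of such a metric: a low ratio is evidence of textual divergence, but it says nothing about whether the result is a derivative work in the legal sense, which is exactly the point of contention in the thread.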

LLM training, fair use, and copyrightability of output

  • Several note recent US decisions:
    • Training on books has been called fair use in some cases.
    • AI-generated works without meaningful human authorship can’t be copyrighted.
  • That leads to conflicting implications:
    • If LLM output isn’t copyrightable, AI rewrites might be de facto public domain, making any new license attached to them void.
    • If humans “edit enough” they may claim authorship, but then prior exposure to GPL code may make the result derivative.
  • Others argue training itself is massive infringement and that current “transformative use” reasoning is a stretch.

Impact on open source, copyleft, and incentives

  • Many fear AI makes copyleft unenforceable in practice: any well‑specified project can be cheaply relicensed via AI, eroding GPL/AGPL and especially “source available” models like SSPL.
  • This could:
    • Discourage releasing source at all (shift to closed or SaaS).
    • Undermine commercial open source and dual licensing.
    • Turn OSS into a “free IP mine” for large AI companies.
  • Some say the incentive loss extends beyond copyleft: if code can always be cloned from behavior, even permissive authors and proprietary vendors lose defensible moats.

Power, centralization, and future of IP

  • LLMs are seen as reinforcing corporate power: frontier models and inference remain capital‑intensive; most people can’t run “good enough” models locally.
  • Others note improving open‑weight models and foresee local agents eventually matching current frontier quality.
  • Several participants argue IP law already disproportionately favors large firms; AI makes this starker by enclosing public knowledge into proprietary models.
  • Views diverge on solutions:
    • Tighten IP (e.g., protect specs/tests, ban AI relicensing in new licenses).
    • Shorten or roll back copyright terms.
    • Treat model outputs as public domain and politically attack IP monopolies rather than copyleft.