AI is just unauthorised plagiarism at a bigger scale

Framing: Is AI “just plagiarism”?

  • Many see LLMs as industrialized, unauthorized reuse of others’ work, especially when outputs closely track specific tutorials, articles, or code.
  • Others argue that humans “plagiarize” in a loose sense too (building on prior work), and that AI learning from text is analogous to human learning, not inherently theft.
  • A recurring distinction: using ideas vs. reproducing expression. People accept influence and remixing, but object to near‑verbatim reuse without credit or consent.

Scale, automation, and qualitative change

  • Several comments stress that scale changes the nature of the problem: what’s tolerable or negligible when individuals do it becomes harmful when automated across billions of documents.
  • AI makes low‑effort rewriting and SEO gaming trivial, flooding the web with derivative content and crowding out originals.

Law, fair use, and copyright disputes

  • Ongoing lawsuits against major AI companies are cited; the legal status of training on copyrighted data is described as unsettled.
  • Some argue training is transformative “learning” and should be fair use; others say copying for training is still copying, and memorization/recall shows it’s effectively lossy compression of protected works.
  • There’s debate about whether robots.txt and site terms should be legally binding for training, and about registering copyrights to enable statutory damages.

Economic & labor impacts

  • Concern that AI concentrates value: public content is scraped for free, then monetized as a paid API, undermining incentives for writers, artists, and coders.
  • Fears that this accelerates wage pressure, job loss, and further “UBI‑style” precarity; others report huge personal productivity gains and “more stuff shipped.”

Web scraping, infrastructure, and SEO

  • Heavy, sometimes non‑compliant crawlers from AI firms reportedly impose bandwidth costs, server load, and even DoS‑like traffic on sites that get no direct benefit in return.
  • Creators report clones outranking originals in search, sometimes seemingly aided by LLM‑assisted rewriting; some say this predates AI but is now easier and faster.
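
For readers unfamiliar with what "compliant" means here: a well-behaved crawler checks a site's robots.txt before fetching a page. The sketch below shows that check using Python's standard urllib.robotparser. The robots.txt contents, the crawler name ExampleAIBot, and the is_allowed helper are all hypothetical, made up for illustration. The check is voluntary on the crawler's side, which is why the thread debates giving robots.txt legal force.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/post/1"
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", url))
    print(is_allowed(ROBOTS_TXT, "SomeBrowser", url))
```

A compliant crawler would skip any URL for which this returns False; a non-compliant one simply never runs the check, and nothing in the protocol itself stops it.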

Open source, IP skepticism, and “information wants to be free”

  • A strong current argues copyright is over‑extended or broken; some welcome AI as de‑facto destruction of IP monopolies.
  • Others counter that IP (including copyleft) is essential for sustainable art, software, and science, and that abolishing it would push everything back into patronage and secrecy.

Proposed responses and regulation

  • Ideas include: mandatory licensing or micropayments for training data; collective or commons ownership of models; forced disclosure of training sets; legal teeth for robots.txt; or an “AI tax” to fund creators and public goods.
  • Skeptics doubt enforceability and worry about entrenching existing monopolies or disadvantaging jurisdictions that regulate while others don’t.

Meta-discussion

  • Some see the debate on sites like HN as skewed by corporate or “AI‑bro” interests; others say opposition is mostly about threatened wages.
  • There’s broad agreement that LLMs are powerful and disruptive, but deep disagreement over whether they are net liberation, net enclosure, or both.