2026-05-21

AI is just unauthorised plagiarism at a bigger scale

Framing: Is AI “just plagiarism”?

Many see LLMs as industrialized, unauthorized reuse of others’ work, especially when outputs closely track specific tutorials, articles, or code.
Others argue that humans “plagiarize” in a loose sense too (building on prior work), and that AI learning from text is analogous to human learning, not inherently theft.
A recurring distinction: using ideas vs. reproducing expression. People accept influence and remixing, but object to near‑verbatim reuse without credit or consent.

Scale, automation, and qualitative change

Several comments stress that scale changes the nature of the problem: what is tolerable or negligible when individuals do it becomes harmful when automated across billions of documents.
AI makes low‑effort rewriting and SEO gaming trivial, flooding the web with derivative content and crowding out originals.

Law, fair use, and copyright disputes

Ongoing lawsuits against major AI companies are cited; the legal status of training on copyrighted data is described as unsettled.
Some argue training is transformative “learning” and should be fair use; others say copying for training is still copying, and that memorization and recall show a model is effectively a lossy compression of protected works.
There’s debate about whether robots.txt and site terms should be legally binding for training, and about registering copyrights to enable statutory damages.

Economic & labor impacts

Concern that AI concentrates value: public content is scraped for free, then monetized as a paid API, undermining incentives for writers, artists, and coders.
Fears that this accelerates wage pressure, job loss, and further “UBI‑style” precarity; others report huge personal productivity gains and “more stuff shipped.”

Web scraping, infrastructure, and SEO

Heavy, sometimes non‑compliant crawlers from AI firms reportedly impose costs, load, and even DoS‑like traffic on sites that get no direct benefit in return.
Creators report clones outranking originals in search, sometimes seemingly aided by LLM‑assisted rewriting; some say this predates AI but is now easier and faster.
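For readers unfamiliar with what "compliant" means for a crawler, the usual expectation is that a bot checks a site's robots.txt before fetching a page. The following is a minimal sketch using Python's standard-library urllib.robotparser. The bot name ExampleAIBot, the example.com URLs, and the robots.txt rules are all hypothetical, chosen only for illustration; nothing here describes how any particular AI firm's crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration).
# A real crawler would download this from https://<site>/robots.txt.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The hypothetical AI bot is blocked from the whole site.
blocked = may_fetch(parser, "ExampleAIBot", "https://example.com/articles/1")

# Any other bot falls under the wildcard rules.
allowed = may_fetch(parser, "OtherBot", "https://example.com/articles/1")
private = may_fetch(parser, "OtherBot", "https://example.com/private/notes")
```

Note that this check is voluntary on the crawler's side: a bot that never consults the file is not stopped by it, which is why commenters ask whether robots.txt should carry legal weight.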

Open source, IP skepticism, and “information wants to be free”

A strong current argues copyright is over‑extended or broken; some welcome AI as de‑facto destruction of IP monopolies.
Others counter that IP (including copyleft) is essential for sustainable art, software, and science, and that abolishing it would push everything back into patronage and secrecy.

Proposed responses and regulation

Ideas include: mandatory licensing or micropayments for training data; collective/model commons ownership; forcing disclosure of training sets; legal teeth for robots.txt; or an “AI tax” to fund creators and public goods.
Skeptics doubt enforceability and worry about entrenching existing monopolies or disadvantaging jurisdictions that regulate when others don’t.

Meta-discussion

Some see the debate on sites like HN as skewed by corporate or “AI‑bro” interests; others say opposition is mostly about threatened wages.
There’s broad agreement that LLMs are powerful and disruptive, but deep disagreement over whether they are net liberation, net enclosure, or both.
