AI is just unauthorised plagiarism at a bigger scale
Framing: Is AI “just plagiarism”?
- Many see LLMs as industrialized, unauthorized reuse of others’ work, especially when outputs closely track specific tutorials, articles, or code.
- Others argue that humans “plagiarize” in a loose sense too (building on prior work), and that AI learning from text is analogous to human learning, not inherently theft.
- A recurring distinction: using ideas vs. reproducing expression. People accept influence and remixing, but object to near‑verbatim reuse without credit or consent.
Scale, automation, and qualitative change
- Several comments stress that scale changes the nature of the problem: what is tolerable or negligible when individuals do it becomes harmful when automated across billions of documents.
- AI makes low‑effort rewriting and SEO gaming trivial, flooding the web with derivative content and crowding out originals.
Law, fair use, and copyright disputes
- Commenters cite ongoing lawsuits against major AI companies and describe the legal status of training on copyrighted data as unsettled.
- Some argue training is transformative “learning” and should be fair use; others say copying for training is still copying, and memorization/recall shows it’s effectively lossy compression of protected works.
- There’s debate about whether robots.txt and site terms should be legally binding for training, and about registering copyrights to enable statutory damages.
Economic & labor impacts
- Concern that AI concentrates value: public content is scraped for free, then monetized as a paid API, undermining incentives for writers, artists, and coders.
- Fears that this accelerates wage pressure, job loss, and further “UBI‑style” precarity; others report huge personal productivity gains and “more stuff shipped.”
Web scraping, infrastructure, and SEO
- Heavy, sometimes non‑compliant crawlers from AI firms reportedly impose bandwidth costs, server load, and even DoS‑like traffic on sites that get no direct benefit in return.
- Creators report clones outranking originals in search, sometimes seemingly aided by LLM‑assisted rewriting; some say this predates AI but is now easier and faster.
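For readers unfamiliar with what "compliant" crawling means in practice, here is a minimal sketch of a robots.txt check using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleAIBot` user agent, the `may_fetch` helper, and the example.com URLs are all hypothetical, chosen only for illustration; nothing here reflects how any particular AI crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler entirely,
# and keep /private/ off limits for everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeBrowser"):
        for url in ("https://example.com/post", "https://example.com/private/x"):
            print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

A compliant crawler would run a check like this before every request and skip disallowed URLs. The check is purely voluntary on the crawler's side, which is why the thread debates whether robots.txt should carry legal weight.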
Open source, IP skepticism, and “information wants to be free”
- A strong current argues copyright is over‑extended or broken; some welcome AI as de‑facto destruction of IP monopolies.
- Others counter that IP (including copyleft) is essential for sustainable art, software, and science, and that abolishing it would push everything back into patronage and secrecy.
Proposed responses and regulation
- Ideas include: mandatory licensing or micropayments for training data; collective or commons ownership of models; forced disclosure of training sets; legal teeth for robots.txt; or an “AI tax” to fund creators and public goods.
- Skeptics doubt enforceability, worry about entrenching existing monopolies, or about disadvantaging jurisdictions that regulate when others don’t.
Meta-discussion
- Some see the debate on sites like HN as skewed by corporate or “AI‑bro” interests; others say opposition is mostly about threatened wages.
- There’s broad agreement that LLMs are powerful and disruptive, but deep disagreement over whether they are net liberation, net enclosure, or both.