ChatGPT hit with privacy complaint over defamatory hallucinations
Data poisoning, manipulation & downstream harms
- Several comments frame indiscriminate web scraping and non-auditable training as “corporate recklessness,” not innocent use of public data.
- Hypothetical “poison” scenarios are raised: hidden instructions in web documents causing models (or derivative scoring systems) to quietly sabotage individuals in hiring, credit, health, or parole contexts.
- Others note that LLMs are already trained on dubious sources; they see greater risk in deliberate, agenda-driven training or censorship than in isolated poisoned pages.
- Comparisons are made to early “Google bombing” and speculation that hostile actors could flood training data to shift model behavior or even markets.
LLMs vs search engines & traditional publishing
- One side argues LLM risks are akin to misusing Google or unvetted articles: the real issue is how downstream systems rely on them.
- Counterpoints highlight key differences:
- Poison in an LLM is embedded behavior, hard to detect or remove.
- Web pages are static and de-indexable; model weights aren’t.
- LLMs can generate novel, source-less defamatory text.
- Some stress that search engines already honor takedown laws, while LLMs currently lack equivalent, robust mechanisms.
Defamation, liability & disclaimers
- Disagreement over whether generic “may be wrong” disclaimers meaningfully shield companies from defamation or GDPR duties.
- Some think holding providers liable would make LLMs unusable or unavailable in strict jurisdictions; others respond that products which must disown their own outputs are fundamentally defective.
- Analogies are drawn to bath salts sold as “not for human consumption” or chatbots whose lies have already produced legal liability in other sectors.
Mitigations & product changes
- Discussion notes that the specific Norwegian case now yields an answer grounded in web search rather than pure model memory.
- There is skepticism this fully fixes the problem: hallucinations remain possible, the model still struggles to say “I don’t know,” and similar errors may affect other names.
- Proposals include: mandatory web-grounding with citations; blocking outputs involving specific names; or treating AI outputs as publisher content, with corresponding responsibility.
Hallucinations, usefulness & overclaiming
- Some argue hallucination is inherent and unrecoverable for high-stakes uses, implying certain applications (legal, credit, reputational) should be off-limits.
- Others say LLMs are valuable as “idea generators” or assistants when the user already has domain knowledge and can verify; they are dangerous as authoritative information sources.
- Critics emphasize that marketing and UI portray these systems as reliable answer engines, not “daydream machines,” creating a mismatch between design, hype, and legal expectations.
Regulation, rights & GDPR
- Multiple comments point to GDPR’s requirements for accuracy and rights to rectification/erasure of personal data, questioning how that can coexist with opaque, weight-encoded training on PII.
- Some see complaints backed by privacy NGOs as essential pressure to force large vendors to take accountability; others fear new liabilities will chill open-source AI and expand surveillance or censorship.
- There is a recurring tension between wanting strong remedies for individuals defamed by models and concern that overbroad rules could effectively ban or severely limit LLM deployment in certain regions.