ChatGPT hit with privacy complaint over defamatory hallucinations

Data poisoning, manipulation & downstream harms

  • Several comments frame indiscriminate web scraping and non-auditable training as “corporate recklessness,” not innocent use of public data.
  • Hypothetical “poison” scenarios are raised: hidden instructions in web documents causing models (or derivative scoring systems) to quietly sabotage individuals in hiring, credit, health, or parole contexts.
  • Others note that LLMs are already trained on dubious sources; they see greater risk in deliberate, agenda-driven training or censorship than in isolated poisoned pages.
  • Comparisons are made to early “Google bombing” and speculation that hostile actors could flood training data to shift model behavior or even markets.

LLMs vs search engines & traditional publishing

  • One side argues LLM risks are akin to misusing Google or unvetted articles: the real issue is how downstream systems rely on them.
  • Counterpoints highlight key differences:
    • Poison in an LLM is embedded behavior, hard to detect or remove.
    • Web pages are static and de-indexable; model weights aren’t.
    • LLMs can generate novel, source-less defamatory text.
  • Some stress that search engines already honor takedown laws, while LLMs currently lack equivalent, robust mechanisms.

Defamation, liability & disclaimers

  • Disagreement over whether generic “may be wrong” disclaimers meaningfully shield companies from defamation or GDPR duties.
  • Some think holding providers liable would make LLMs unusable or unavailable in strict jurisdictions; others respond that products which must disown their own outputs are fundamentally defective.
  • Analogies are drawn to bath salts sold as “not for human consumption” or chatbots whose lies have already produced legal liability in other sectors.

Mitigations & product changes

  • Discussion notes that the specific Norwegian case now yields an answer grounded in web search rather than pure model memory.
  • There is skepticism this fully fixes the problem: hallucinations remain possible, the model still struggles to say “I don’t know,” and similar errors may affect other names.
  • Proposals include: mandatory web-grounding with citations; blocking outputs involving specific names; or treating AI outputs as publisher content, with corresponding responsibility.

Hallucinations, usefulness & overclaiming

  • Some argue hallucination is inherent and unrecoverable for high-stakes uses, implying certain applications (legal, credit, reputational) should be off-limits.
  • Others say LLMs are valuable as “idea generators” or assistants when the user already has domain knowledge and can verify; they are dangerous as authoritative information sources.
  • Critics emphasize that marketing and UI portray these systems as reliable answer engines, not “daydream machines,” creating a mismatch between design, hype, and legal expectations.

Regulation, rights & GDPR

  • Multiple comments point to GDPR’s requirements for accuracy and rights to rectification/erasure of personal data, questioning how that can coexist with opaque, weight-encoded training on PII.
  • Some see complaints backed by privacy NGOs as essential pressure to force large vendors to take accountability; others fear new liabilities will chill open-source AI and expand surveillance or censorship.
  • There is a recurring tension between wanting strong remedies for individuals defamed by models and concern that overbroad rules could effectively ban or severely limit LLM deployment in certain regions.