AI assistants misrepresent news content 45% of the time
Human vs AI accuracy
- Many argue the 45% error rate is meaningless without a human baseline: both average readers and journalists frequently misrepresent science, politics, and technical topics (“Gell‑Mann amnesia” is cited).
- Others counter that this is not an excuse: AI sits downstream of human news, so it amplifies existing errors with additional hallucinations, creating a “stochastic telephone” chain.
- Some speculate AI summarization might still outperform low‑quality journalism or wire‑rewrite pieces, but this is described as unclear and unmeasured.
Methodology and metrics
- Several commenters think the study is weakly designed: ~30 “core” questions, free/consumer models (GPT‑4o, Gemini 2.5 Flash, free Copilot, free Perplexity), and no comparison to state‑of‑the‑art paid models.
- “Errors” are often sourcing issues (missing/incorrect citations, Wikipedia overuse, outdated articles) rather than outright fabricated facts, a framing some see as nitpicky.
- Others point out concrete, serious failures: hallucinated Wikipedia pages, non‑existent URLs, invented policies, and outdated geopolitical facts.
Experiences with AI summaries
- Positive reports: AI note‑takers and meeting summarizers (Copilot, others) are often judged “good enough” and sometimes better than human notes, provided humans proofread.
- Negative reports: Gemini and Perplexity hallucinating entire news items, links, and citations; call and email summaries that invert key decisions or add imaginary agreements; media monitoring that’s unusable.
- Some tools (e.g., Kagi News, custom RAG setups) are seen as more reliable when constrained to specific articles and verifiable sources; a minimal grounding sketch follows this list.
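As a rough illustration of what “constrained to specific articles” can mean in such setups, here is a minimal sketch assuming an OpenAI-style chat API; the function name, prompt wording, and model choice are illustrative assumptions, not any commenter’s actual pipeline.

```python
# Minimal sketch of a summarizer constrained to one supplied article.
# Assumes the OpenAI Python SDK (v1.x) purely for illustration; the prompt
# wording, function name, and model choice are not from any commenter's setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarize_grounded(article_text: str, article_url: str) -> str:
    """Summarize strictly from the supplied article; never add outside facts."""
    system = (
        "Summarize ONLY from the article text provided by the user. "
        "Do not add facts, dates, quotes, or links that are not in the text. "
        "If the article does not cover something, say so explicitly. "
        f"Cite exactly one source: {article_url}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",        # one of the models named in the study
        temperature=0,         # favor restatement over creative paraphrase
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": article_text},
        ],
    )
    return resp.choices[0].message.content
```

The point of the constraint is that the model never reaches for the open web or its training data for extra detail: it can only restate the article or decline, which is what commenters credit for the better reliability of these setups.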
Media ecosystem and incentives
- A recurring theme is that traditional news is already highly biased, narrative‑driven, and often wrong; AI is seen either as further degradation into “slop” or as a potential disruptor of bad journalism.
- Commenters note BBC and other public broadcasters have a vested interest in emphasizing AI’s flaws, especially while restricting crawlers and litigating against AI companies.
Risks, responsibility, and mitigation
- Concerns include people outsourcing critical thinking, gaining “anti‑knowledge,” and having confirmation bias supercharged by plausible‑sounding AI outputs.
- Some argue the human vs AI comparison is secondary: because AI can scale to billions of interactions, its standalone error rate must be extremely low.
- Proposed mitigations: strict grounding and tool use (live web checks), explicit source verification, better user education on failure modes, and higher methodological standards in evaluating AI; a sketch of one such verification check follows.
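One hedged way to operationalize “explicit source verification” is a post‑hoc check that every URL an assistant cites both resolves and was actually among the sources it was given; the helper below, including its name, regex, and allow‑list, is an illustrative assumption rather than a tool mentioned in the thread.

```python
# Minimal sketch of post-hoc "explicit source verification": every URL cited
# in an AI summary must (a) be one of the sources the model was actually given
# and (b) resolve over HTTP. The helper name, regex, and allow-list are
# illustrative assumptions, not a tool referenced in the discussion.
import re
import requests

URL_RE = re.compile(r'https?://[^\s\)\]">]+')

def verify_citations(summary: str, allowed_sources: set[str]) -> dict[str, str]:
    """Return a verdict per cited URL: 'ok', 'not_in_sources', or 'unreachable'."""
    verdicts: dict[str, str] = {}
    for url in URL_RE.findall(summary):
        if url not in allowed_sources:
            verdicts[url] = "not_in_sources"  # cited something it was never given
            continue
        try:
            resp = requests.head(url, allow_redirects=True, timeout=10)
            verdicts[url] = "ok" if resp.status_code < 400 else "unreachable"
        except requests.RequestException:
            verdicts[url] = "unreachable"     # catches hallucinated or dead links
    return verdicts
```

A failing verdict flags exactly the failure modes listed above (non‑existent URLs, hallucinated citations) before a summary reaches a reader.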