Google scrambles to manually remove weird AI answers in search

Product quality and manual cleanup

  • Many see it as ironic and damning that Google is “manually removing” bad AI answers; it feels like human-powered patching of a system sold as autonomous.
  • The prevailing view in the thread: AI Overviews are “pre‑alpha trash,” worse than traditional search, and now restricted or disabled for many users.
  • Several note that Google has a history of flashy AI launches (e.g., Gemini image issues) followed by public embarrassment and fixes driven by bad publicity rather than by systematic testing.

Training data, hallucinations, and summaries

  • Strong criticism of training or grounding on Reddit, Twitter, and other noisy user-generated content (UGC), which is full of jokes, sarcasm, astroturfing, and low-effort posts.
  • Some emphasize that the pizza-glue and rock-eating answers are not classic “hallucinations” but faithful summaries of joke posts; the model can’t reliably detect satire or bad-faith content.
  • Others argue LLMs are always hallucinating in a technical sense: they generate plausible text, not truth, and there is no internal mechanism to distinguish fact from fabrication.

Accuracy, safety, and responsibility

  • Debate over acceptable error rates: answers that are “correct” only 80–90% of the time are seen as unacceptable for search, especially for health or safety queries.
  • Many call for models to say “I don’t know” and expose confidence, or to refuse high‑risk questions.
  • Disagreement over Google’s responsibility: some say it’s unreasonable to make Google protect “every mentally ill or naive user”; others argue that dangerous guidance (e.g., on cooking, medical questions, or self-harm) crosses into negligence or libel.
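
A minimal sketch of the "say I don't know or refuse" behavior some commenters ask for. Everything here is hypothetical: the `Candidate` type, the `respond` function, the 0.9 threshold, and the keyword list are illustrative assumptions, not anything Google is known to use. It also assumes a calibrated confidence score is available, which other commenters in the thread doubt is possible for LLMs.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only.
HIGH_RISK_TERMS = {"dosage", "overdose", "poison", "self-harm", "suicide"}
MIN_CONFIDENCE = 0.9


@dataclass
class Candidate:
    text: str
    confidence: float  # assumed to be a calibrated score in [0, 1]


def respond(query: str, candidate: Candidate) -> str:
    """Return the answer, an abstention, or a refusal."""
    # Crude keyword match; a real system would need a proper risk classifier.
    words = set(query.lower().replace("?", " ").split())
    if words & HIGH_RISK_TERMS:
        return "REFUSE: high-risk query, showing regular search results only."
    if candidate.confidence < MIN_CONFIDENCE:
        return "I don't know."
    # Expose the confidence alongside the answer.
    return f"{candidate.text} (confidence: {candidate.confidence:.0%})"


print(respond("how much glue on pizza?", Candidate("1/8 cup", 0.4)))
print(respond("capital of France", Candidate("Paris", 0.97)))
```

The refusal check runs first so that a high-risk query is never answered, however confident the model claims to be. The threshold only helps if the score is meaningful, which is exactly the point the thread disputes.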

Search quality and AI as answer engine

  • Users complain that core Google search and YouTube search have been degrading for years (spam, SEO sludge); building AI on this corpus is seen as “garbage in, garbage out.”
  • Several argue Google mistakenly turned a search engine into a single-answer oracle, shifting trust from sources to Google itself and taking on “arbiter of truth” risk.
  • Some want clear separation: classic search for research vs. optional AI Q&A, with transparency and an off switch.

Organization, incentives, and AI hype

  • Commenters blame Wall Street and leadership panic over “AI wars” for rushing half-baked features, comparing Google to Boeing’s recent trajectory.
  • Cultural critiques: LeetCode-heavy hiring, weak QA, leadership detached from technical reality, and an ad-driven business model that resists costly curation.
  • Views on AI overall are mixed: some find tools like GPT‑4/Gemini genuinely useful in narrow domains; others predict an AI bubble, citing hype, poor reasoning ability, and mounting user disillusionment.