2024-05-25

Google scrambles to manually remove weird AI answers in search

Product quality and manual cleanup

Many see it as ironic and damning that Google is “manually removing” bad AI answers; it feels like human-powered patching of a system sold as autonomous.
Thread consensus: AI Overviews are “pre‑alpha trash,” worse than traditional search, and now mostly restricted or disabled for many users.
Several note that Google has a history of flashy AI launches (e.g., Gemini image issues) followed by public embarrassment and reactive fixes driven by publicity rather than systematic testing.

Training data, hallucinations, and summaries

Strong criticism of training or grounding on Reddit, Twitter, and other noisy UGC: full of jokes, sarcasm, astroturfing, and low-effort content.
Some emphasize that the pizza-glue and rock-eating answers are not classic “hallucinations” but faithful summaries of joke posts; the model can’t reliably detect satire or bad-faith content.
Others argue LLMs are always hallucinating in a technical sense: they generate plausible text, not truth, and there is no internal mechanism to distinguish fact from fabrication.

Accuracy, safety, and responsibility

Debate over acceptable error rates: 80–90% “correct” is seen as unacceptable for search, especially for health or safety queries.
Many call for models to say “I don’t know” and expose confidence, or to refuse high‑risk questions.
Disagreement over Google’s responsibility: some say it’s unreasonable to make Google protect “every mentally ill or naive user”; others argue that dangerous guidance (e.g., on cooking, medical questions, or self-harm) crosses into negligence or libel.

Search quality and AI as answer engine

Users complain core Google search and YouTube search have degraded for years (spam, SEO sludge); building AI on this corpus is seen as “garbage in, garbage out.”
Several argue Google mistakenly turned a search engine into a single-answer oracle, shifting trust from sources to Google itself and taking on “arbiter of truth” risk.
Some want clear separation: classic search for research vs. optional AI Q&A, with transparency and an off switch.

Organization, incentives, and AI hype

Commenters blame Wall Street and leadership panic over “AI wars” for rushing half-baked features, comparing Google to Boeing’s recent trajectory.
Cultural critiques: leetcode-heavy hiring, weak QA, leadership detached from technical reality, and an ad-driven business model that resists costly curation.
Views on AI overall are mixed: some find tools like GPT‑4/Gemini genuinely useful in narrow domains; others predict an AI bubble, pointing to hype, poor reasoning ability, and mounting user disillusionment.
