If you believe in "Artificial Intelligence", take five minutes to ask it
Limits of LLM Knowledge & Hallucinations
- Many comments agree the article’s core point is valid: LLMs are often wrong on niche, non-verifiable factual questions and will confidently invent details.
- Several note this is unsurprising given finite model size and lossy compression of training data. Expecting encyclopedic, perfectly reliable recall is seen as a category error.
- Some argue that a system that is only safe to use when its answers can be independently checked has limited value for lay users, who cannot do that verification.
Intelligence vs Knowledge (and Metacognition)
- Strong debate over whether current LLMs exhibit “intelligence” or are merely large statistical parrots.
- One camp: LLMs predict tokens; they don’t reason, lack metacognition, and don’t “know that they don’t know,” so outputs must be treated like hearsay.
- Another camp: prediction is a form of reasoning; internal representations and emergent “reasoning-style” behavior suggest a primitive world model.
- Multiple examples show models sometimes honestly say “I don’t know,” but skeptics call this surface-level roleplay, not genuine uncertainty.
Appropriate Use Cases & Verification
- Widely reported sweet spots:
  - Coding (snippets, refactors, bug hints), where compilers and tests verify the output.
  - Summarization, extraction, rewriting, translation, itinerary planning, and brainstorming.
- Many only trust LLMs where correctness is self-evident (code runs, script behaves, travel plan is reviewable) and avoid them for unverified summaries, medical notes, or legal/technical reports.
- A repeated warning: the most dangerous zone is ~90% correctness with no easy way to tell which 10% is wrong.
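The "only trust output that something else can check" pattern in the bullets above can be sketched as a small gate: run the model's candidate code against independent tests before accepting it. Here `llm_suggest` is a hypothetical stand-in for a real model call, stubbed so the sketch runs offline; the test harness is an illustrative assumption, not a specific tool from the discussion.

```python
def llm_suggest(prompt: str) -> str:
    # Hypothetical stand-in for a model call; returns a candidate implementation.
    return "def slugify(s):\n    return s.strip().lower().replace(' ', '-')"

def accept_if_tests_pass(code: str, tests) -> bool:
    """Execute candidate code in a fresh namespace and accept it only if
    every independent check passes. (Never exec untrusted code outside a
    sandbox; this is a sketch of the verification idea, not a safe runner.)"""
    ns = {}
    try:
        exec(code, ns)
        return all(t(ns) for t in tests)
    except Exception:
        return False

# Independent checks the model never sees: correctness is decided here,
# not by how confident the generated code looks.
tests = [
    lambda ns: ns["slugify"]("Hello World") == "hello-world",
    lambda ns: ns["slugify"]("  Trim Me ") == "trim-me",
]

ok = accept_if_tests_pass(llm_suggest("write a slugify function"), tests)
```

The point of the gate is exactly the "dangerous 90%" warning: without an oracle like `tests`, there is no way to tell which outputs fall in the wrong 10%.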
Model Progress, RAG, and Tooling
- Several commenters re-run the article’s dinosaur-taxonomy question against newer models (or models with web/search integration) and get largely correct, nuanced answers.
- This is used to argue the article is already dated and that real value comes from combining LLMs with retrieval (RAG/GraphRAG) and search, not from raw pretrained models alone.
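The retrieval argument above can be made concrete with a minimal sketch: instead of relying on the model's parametric recall, fetch relevant passages first and constrain the answer to them. The corpus and the word-overlap scorer below are toy assumptions standing in for a real index and embedding search, not any particular RAG stack mentioned in the thread.

```python
# Toy corpus standing in for a retrieval index.
CORPUS = [
    "Brontosaurus was reinstated as a valid genus in a 2015 analysis.",
    "Apatosaurus and Brontosaurus are both diplodocid sauropods.",
    "Tyrannosaurus rex lived in the Late Cretaceous period.",
]

def retrieve(query: str, corpus, k: int = 2):
    """Rank passages by naive word overlap with the query (a stand-in
    for embedding similarity) and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: -len(q & set(p.lower().split())))
    return ranked[:k]

def build_prompt(query: str, passages) -> str:
    """Assemble a grounded prompt: the model is told to answer only
    from the retrieved context, which also makes its sources checkable."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

query = "Is Brontosaurus a valid genus?"
prompt = build_prompt(query, retrieve("Brontosaurus valid genus", CORPUS))
```

Grounding the prompt this way is why commenters argue the raw-pretrained-model comparison is dated: the retrieval step supplies the niche facts the weights compress away, and it leaves a citation trail a user can verify.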
Hype, Marketing, and Societal Risk
- Some are tired of “LLMs hallucinate niche trivia” posts; others say such critiques remain crucial because LLMs are marketed and perceived as general knowledge oracles.
- Comparisons split between “another iPhone moment” and “another crypto bubble”; skepticism focuses on overhyped claims, regulatory capture, and use in surveillance and automation.
- Epistemology and human parallels come up: humans also confabulate and misremember, but unlike LLMs they learn from feedback and have embodied experience, which many consider the key missing ingredients.