If you believe in "Artificial Intelligence", take five minutes to ask it

Limits of LLM Knowledge & Hallucinations

  • Many commenters agree the article’s core point is valid: LLMs are often wrong on niche, hard-to-verify factual questions and will confidently invent details.
  • Several note this is unsurprising given finite model size and lossy compression of training data. Expecting encyclopedic, perfectly reliable recall is seen as a category error.
  • Some argue that if a system is only safe to use when the answer can be independently checked, its value is limited for lay users who can’t verify.

Intelligence vs Knowledge (and Metacognition)

  • Strong debate over whether current LLMs exhibit “intelligence” or are merely “stochastic parrots.”
  • One camp: LLMs predict tokens; they don’t reason, lack metacognition, and don’t “know that they don’t know,” so outputs must be treated like hearsay.
  • Another camp: prediction is a form of reasoning; internal representations and emergent “reasoning-style” behavior suggest a primitive world model.
  • Multiple examples show models sometimes honestly say “I don’t know,” but skeptics call this surface-level roleplay, not genuine uncertainty.

Appropriate Use Cases & Verification

  • Widely reported sweet spots:
    • Coding (snippets, refactors, bug hints) where compilers/tests verify output.
    • Summarization, extraction, rewriting, translation, itinerary planning, and brainstorming.
  • Many only trust LLMs where correctness is self-evident (the code runs, the script behaves, the travel plan is reviewable) and avoid them where it can’t be checked, e.g. summaries of documents they haven’t read, medical notes, or legal/technical reports.
  • A repeated warning: the most dangerous zone is ~90% correctness with no easy way to tell which 10% is wrong.
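
The coding sweet spot above rests on one pattern: treat model output as untrusted and gate it behind tests the human wrote. A minimal sketch, where `candidate_source` and `slugify` are hypothetical stand-ins for text an LLM returned:

```python
# Hypothetical LLM-generated code, received as a string.
candidate_source = """
def slugify(title):
    return "-".join(title.lower().split())
"""

def passes_tests(source: str) -> bool:
    """Accept candidate code only if it compiles, runs, and passes
    independently written checks; any failure means rejection."""
    namespace = {}
    try:
        exec(source, namespace)            # run the candidate in isolation
        slugify = namespace["slugify"]
        # Checks authored by the human reviewer, not by the model.
        assert slugify("Hello World") == "hello-world"
        assert slugify("  A  B ") == "a-b"
        return True
    except Exception:
        return False
```

This is also why the ~90%-correct zone is dangerous: without a harness like this, nothing flags which outputs fall in the wrong 10%.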

Model Progress, RAG, and Tooling

  • Several reproduce the dinosaur-taxonomy question with newer models (or with web/search integration) and get largely correct, nuanced answers.
  • This is used to argue the article is already dated and that real value comes from combining LLMs with retrieval (RAG/GraphRAG) and search, not from raw pretrained models alone.
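
The retrieval argument can be sketched in a few lines: instead of relying on the model’s parametric recall, fetch relevant text first and instruct the model to answer from it. The corpus, scoring (bag-of-words overlap), and prompt wording here are illustrative placeholders, not any particular RAG framework:

```python
# Toy corpus standing in for a search index or vector store.
corpus = [
    "Tyrannosaurus rex lived in the Late Cretaceous, about 68-66 Mya.",
    "Stegosaurus lived in the Late Jurassic, about 155-145 Mya.",
    "The Eiffel Tower was completed in 1889.",
]

def retrieve(question: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    """Ground the model: supply retrieved context and restrict the
    answer to it, rather than trusting pretrained recall."""
    context = retrieve(question, corpus)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Real systems swap the overlap score for embedding similarity and pass the prompt to a model, but the division of labor is the same: retrieval supplies the facts, the LLM supplies the language.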

Hype, Marketing, and Societal Risk

  • Some are tired of “LLMs hallucinate niche trivia” posts; others say such critiques remain crucial because LLMs are marketed and perceived as general knowledge oracles.
  • Comparisons split between “another iPhone moment” and “another crypto bubble”; skepticism focuses on overhyped claims, regulatory capture, and use in surveillance and automation.
  • Epistemology and human parallels are discussed: humans also confabulate and misremember, but unlike LLMs, they learn from feedback and have embodied experience, which many consider a key missing ingredient.