The "confident idiot" problem: Why AI needs hard rules, not vibe checks

Nature of LLMs: text, not truth

  • Many comments stress that LLMs model word sequences, not facts; they optimize next-token probabilities, not correctness.
  • “Hallucinations” are seen as inevitable: the model always returns something; correctness is judged externally by humans.
  • Determinism (fixed seeds, temp=0) would only make them wrong in the same way every time; non‑determinism isn’t the core problem (a toy sketch follows this list).
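
To make the determinism point concrete, here is a toy Python sketch. Everything in it is an illustrative assumption (the hand-picked scores, the prompt, the next_token helper are not from the thread or any real model); it only shows that temperature 0 and fixed seeds buy repeatability, not correctness.

    import math
    import random

    # Toy, hand-made scores for continuations of "The capital of Australia is ...".
    # The point: the scores encode text statistics, not facts, so the top-scoring
    # continuation can simply be wrong.
    logits = {"Sydney": 3.2, "Canberra": 2.9, "Melbourne": 1.1}

    def next_token(logits, temperature, seed=None):
        """Pick one continuation: greedy at temperature 0, seeded softmax sampling otherwise."""
        if temperature == 0:
            return max(logits, key=logits.get)          # deterministic argmax
        rng = random.Random(seed)                       # fixed seed => reproducible draw
        weights = [math.exp(score / temperature) for score in logits.values()]
        return rng.choices(list(logits), weights=weights, k=1)[0]

    print(next_token(logits, temperature=0))            # "Sydney" on every run: deterministic, still wrong
    print(next_token(logits, temperature=0.8, seed=7))  # reproducible with a fixed seed, no more grounded in truth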

Hard rules, validation, and guardrails

  • The article’s proposal (external verifiers/assertions around LLM output) resonates with people building agents: treat the model as an untrusted component and validate its output like any other input (a sketch follows this list).
  • Suggested tools: schemas and structured output, HTTP checks, strong type systems (Haskell/OCaml/Rust), property-based tests, Prolog/DSL controllers, external scripts and benchmarks, and classic validation libraries.
  • Some liken this to pre‑flight checklists or TDD: LLMs handle the “soft” generation, while deterministic code and tests enforce reality.
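
To make the “untrusted component” idea concrete, here is a minimal Python sketch built on stated assumptions: call_llm stands in for whatever client is actually in use, and the weather-report schema, the range check, and the retry budget are invented for illustration rather than taken from the article or the thread.

    import json

    REQUIRED_FIELDS = {"city": str, "temperature_c": (int, float)}  # hypothetical schema

    def validate_weather_reply(raw: str) -> dict:
        """Hard checks on the model's reply: parse, type-check, and range-check, or raise."""
        data = json.loads(raw)                                    # malformed JSON fails loudly
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(data.get(field), expected_type):
                raise ValueError(f"missing or mistyped field: {field}")
        if not -90 <= data["temperature_c"] <= 60:                # domain rule, not a vibe check
            raise ValueError("temperature outside plausible range")
        return data

    def ask_with_guardrail(prompt: str, call_llm, max_retries: int = 2) -> dict:
        """Retry on validation failure; surface the last error instead of trusting the model."""
        last_error = None
        for _ in range(max_retries + 1):
            try:
                return validate_weather_reply(call_llm(prompt))
            except ValueError as exc:                             # json.JSONDecodeError is a ValueError
                last_error = exc
        raise RuntimeError(f"LLM output never passed validation: {last_error}")

The design point, echoing the thread, is that acceptance is decided by ordinary deterministic code: the same validation path would apply whether the reply came from a model, a user, or a flaky API.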

Limits and criticisms of the “rules around LLMs” approach

  • Critics note that most high‑stakes tasks (medicine, judgment calls) can’t be fully captured by simple assertions; ultimate verification must be human.
  • Others argue the library still “fixes probability with more probability,” since its rules are ultimately injected back into prompts that the model is free to ignore.
  • Experience reports: attempts to wrap agents with many verifiers hit reward‑hacking, long tails of missing checks, and inconsistent behavior across repos/languages.

Humans vs LLMs, and world models

  • Large subthreads debate whether LLMs “reason” or “understand” at all, or are just sophisticated text compressors.
  • Multiple commenters emphasize that humans have embodied world models and accountability, whereas LLMs learn only from second‑hand text with no grounding.
  • Counter‑arguments: human knowledge is also error‑ridden; LLMs encode some genuine structure (e.g., numerical patterns) and can approximate aspects of reasoning.

Anthropomorphism, sycophancy, and UX

  • Many dislike the overconfident, flattering style: long answers, fake certainty, reluctance to say “I don’t know.”
  • This is widely attributed to RLHF and training data (Q&A, SEO content, Reddit), not inherent model limits.
  • Several users want models that ask clarifying questions, behave more like cautious tools, or adopt explicitly robotic, non‑human personas.