The "confident idiot" problem: Why AI needs hard rules, not vibe checks

Nature of LLMs: text, not truth

  • Many comments stress that LLMs model word sequences, not facts; they optimize next-token probabilities, not correctness.
  • “Hallucinations” are seen as inevitable: the model always returns something; correctness is judged externally by humans.
  • Determinism (fixed seeds, temp=0) would only make them wrong in the same way every time; non‑determinism isn’t the core problem (a toy sketch follows this list).
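
To make the determinism point concrete, here is a toy Python sketch. Everything in it is an illustrative assumption (the hand-picked scores, the prompt, the next_token helper are not from the thread or any real model); it only shows that temperature 0 and fixed seeds buy repeatability, not correctness.

    import math
    import random

    # Toy, hand-made scores for continuations of "The capital of Australia is ...".
    # The point: the scores encode text statistics, not facts, so the top-scoring
    # continuation can simply be wrong.
    logits = {"Sydney": 3.2, "Canberra": 2.9, "Melbourne": 1.1}

    def next_token(logits, temperature, seed=None):
        """Pick one continuation: greedy at temperature 0, seeded softmax sampling otherwise."""
        if temperature == 0:
            return max(logits, key=logits.get)          # deterministic argmax
        rng = random.Random(seed)                       # fixed seed => reproducible draw
        weights = [math.exp(score / temperature) for score in logits.values()]
        return rng.choices(list(logits), weights=weights, k=1)[0]

    print(next_token(logits, temperature=0))            # "Sydney" on every run: deterministic, still wrong
    print(next_token(logits, temperature=0.8, seed=7))  # reproducible with a fixed seed, no more grounded in truth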

Hard rules, validation, and guardrails

  • The article’s proposal (external verifiers/assertions around LLM output) resonates with people building agents: treat the model as an untrusted component and validate its output like any other input (a sketch follows this list).
  • Suggested tools: schemas and structured output, HTTP checks, strong type systems (Haskell/OCaml/Rust), property-based tests, Prolog/DSL controllers, external scripts and benchmarks, and classic validation libraries.
  • Some liken this to pre‑flight checklists or TDD: LLMs handle the “soft” generation, while deterministic code and tests enforce reality.
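
To make the “untrusted component” idea concrete, here is a minimal Python sketch built on stated assumptions: call_llm stands in for whatever client is actually in use, and the weather-report schema, the range check, and the retry budget are invented for illustration rather than taken from the article or the thread.

    import json

    REQUIRED_FIELDS = {"city": str, "temperature_c": (int, float)}  # hypothetical schema

    def validate_weather_reply(raw: str) -> dict:
        """Hard checks on the model's reply: parse, type-check, and range-check, or raise."""
        data = json.loads(raw)                                    # malformed JSON fails loudly
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(data.get(field), expected_type):
                raise ValueError(f"missing or mistyped field: {field}")
        if not -90 <= data["temperature_c"] <= 60:                # domain rule, not a vibe check
            raise ValueError("temperature outside plausible range")
        return data

    def ask_with_guardrail(prompt: str, call_llm, max_retries: int = 2) -> dict:
        """Retry on validation failure; surface the last error instead of trusting the model."""
        last_error = None
        for _ in range(max_retries + 1):
            try:
                return validate_weather_reply(call_llm(prompt))
            except ValueError as exc:                             # json.JSONDecodeError is a ValueError
                last_error = exc
        raise RuntimeError(f"LLM output never passed validation: {last_error}")

The design point, echoing the thread, is that acceptance is decided by ordinary deterministic code: the same validation path would apply whether the reply came from a model, a user, or a flaky API.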

Limits and criticisms of the “rules around LLMs” approach

  • Critics note that most high‑stakes tasks (medicine, judgment calls) can’t be fully captured by simple assertions; ultimate verification must be human.
  • Others argue the library still “fixes probability with more probability,” since its rules are ultimately injected back into prompts that the model is free to ignore.
  • Experience reports: attempts to wrap agents with many verifiers hit reward‑hacking, long tails of missing checks, and inconsistent behavior across repos/languages.

Humans vs LLMs, and world models

  • Large subthreads debate whether LLMs “reason” or “understand” at all, or are just sophisticated text compressors.
  • Multiple commenters emphasize that humans have embodied world models and accountability, whereas LLMs learn only from second‑hand text with no grounding.
  • Counter‑arguments: human knowledge is also error‑ridden; LLMs encode some genuine structure (e.g., numerical patterns) and can approximate aspects of reasoning.

Anthropomorphism, sycophancy, and UX

  • Many dislike the overconfident, flattering style: long answers, fake certainty, reluctance to say “I don’t know.”
  • This is widely attributed to RLHF and training data (Q&A, SEO content, Reddit), not inherent model limits.
  • Several users want models that ask clarifying questions, behave more like cautious tools, or adopt explicitly robotic, non‑human personas.