The "confident idiot" problem: Why AI needs hard rules, not vibe checks
Nature of LLMs: text, not truth
- Many comments stress that LLMs model word sequences, not facts; they optimize next-token probabilities, not correctness.
- “Hallucinations” are seen as inevitable: the model always returns something; correctness is judged externally by humans.
- Determinism (fixed seeds, temp=0) would only make them wrong the same way every time; non‑determinism isn’t the core problem.
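A toy decoder makes the determinism point in the last bullet concrete; the token scores below are invented for illustration, not taken from any real model. At temperature 0 sampling collapses to argmax, so the same prompt always yields the same continuation, whether or not that continuation happens to be true.

```python
import math
import random

# Toy next-token scores for the prompt "The capital of Australia is".
# These numbers are made up; a real model scores its entire vocabulary.
logits = {"Sydney": 3.1, "Canberra": 2.8, "Melbourne": 1.2}

def sample(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick the next token; temperature 0 degenerates to deterministic argmax."""
    if temperature == 0:
        return max(logits, key=logits.get)
    weights = [math.exp(v / temperature) for v in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed
print(sample(logits, 0, rng))    # always "Sydney": wrong, but reproducibly wrong
print(sample(logits, 1.0, rng))  # varies with the seed; still ranked by probability, not truth
```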
Hard rules, validation, and guardrails
- The article’s proposal (external verifiers/assertions around LLM output) resonates with people building agents: treat the model as an untrusted component and validate its output like any other untrusted input.
- Suggested tools: schemas and structured output, HTTP checks, property-based tests, strong type systems (Haskell/OCaml/Rust), Prolog/DSL controllers, external scripts and benchmarks, classic validation libraries (a minimal sketch follows this list).
- Some liken this to pre‑flight checklists or TDD: the LLM handles “soft” generation, while deterministic code and tests enforce reality.
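A minimal sketch of the “treat the model as untrusted” pattern, assuming a hypothetical `call_llm(prompt)` wrapper and an invented invoice-extraction task: the model’s text is parsed and checked by ordinary deterministic code, and anything that fails the checks is retried and finally rejected rather than trusted.

```python
import json

MAX_RETRIES = 3

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM API; returns raw text.
    Canned reply here so the sketch runs; nothing downstream trusts it."""
    return '{"invoice_id": "INV-1", "total": 42.0, "currency": "USD"}'

def validate_invoice(data: dict) -> list[str]:
    """Deterministic checks on the parsed output; returns a list of problems."""
    problems = []
    if not isinstance(data.get("invoice_id"), str):
        problems.append("invoice_id must be a string")
    if not isinstance(data.get("total"), (int, float)) or data["total"] < 0:
        problems.append("total must be a non-negative number")
    if data.get("currency") not in {"USD", "EUR", "GBP"}:
        problems.append("currency must be USD, EUR, or GBP")
    return problems

def extract_invoice(text: str) -> dict:
    """Treat the model as untrusted: parse, validate, retry, then fail loudly."""
    prompt = f"Return invoice_id, total, currency as JSON for:\n{text}"
    for _ in range(MAX_RETRIES):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not even valid JSON; ask again
        problems = validate_invoice(data)
        if not problems:
            return data  # passed every hard check
        prompt += "\nFix these problems: " + "; ".join(problems)
    raise ValueError("LLM output failed validation after retries")

print(extract_invoice("Invoice INV-1, total $42.00, billed in USD"))
```

The same shape works with a schema validator (e.g., jsonschema or pydantic) or property-based tests in place of the hand-written checks.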
Limits and criticisms of the “rules around LLMs” approach
- Critics note that most high‑stakes tasks (medicine, judgment calls) can’t be fully captured by simple assertions; ultimate verification must be human.
- Others argue the library still “fixes probability with more probability,” since its rules are injected back into prompts that the model is free to ignore (the contrast is sketched in code after this list).
- Experience reports: attempts to wrap agents with many verifiers hit reward‑hacking, long tails of missing checks, and inconsistent behavior across repos/languages.
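To make the “probability fixing probability” objection concrete, here is the distinction in miniature, again with a hypothetical `call_llm` stub: the first rule lives in the prompt and is only probabilistically honored; the second lives in code and runs every time.

```python
import json

def call_llm(prompt: str) -> str:
    """Same hypothetical stub as above; any real LLM client would do."""
    return '{"invoice_id": "INV-2", "total": 17.5, "currency": "EUR"}'

base_prompt = "Return invoice_id, total, currency as JSON for the invoice below.\n<invoice text>"

# "Soft" rule: appended to the prompt, so the model may or may not honor it.
prompt = base_prompt + "\nRule: total must never be negative."

# "Hard" rule: a deterministic check outside the model; a violation cannot
# slip through no matter what the prompt said.
result = json.loads(call_llm(prompt))
assert result["total"] >= 0, "rule violated despite being stated in the prompt"
```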
Humans vs LLMs, and world models
- Large subthreads debate whether LLMs “reason” or “understand” at all, or are just sophisticated text compressors.
- Multiple commenters emphasize that humans have embodied world models and accountability, whereas LLMs learn only from second‑hand text with no grounding.
- Counter‑arguments: human knowledge is also error‑ridden; LLMs encode some genuine structure (e.g., numerical patterns) and can approximate aspects of reasoning.
Anthropomorphism, sycophancy, and UX
- Many dislike the overconfident, flattering style: long answers, fake certainty, reluctance to say “I don’t know.”
- This is widely attributed to RLHF and training data (Q&A, SEO content, Reddit), not inherent model limits.
- Several users want models that ask clarifying questions, behave more like cautious tools, or adopt explicitly robotic, non‑human personas.