GPTs and Hallucination
Study methodology & context windows
- Some argue the paper’s design is “flawed” because prompts were issued sequentially in one chat, so constraints from earlier prompts (e.g., “answer in three words”) bled into later ones.
- Others point out the paper also ran isolated sessions precisely to test context dependence; running both conditions is seen as the core of the experiment, not a mistake.
- A subset worries the analysis doesn’t clearly separate the two conditions, leaving room for cherry‑picking of results.
What “hallucination” means
- Strong disagreement over terminology: “hallucination” vs “bullshitting,” “confabulation,” “bad output,” “misprediction.”
- Critics say “hallucinate” anthropomorphizes systems that have no beliefs or awareness, obscuring that the behavior is simply erroneous output.
- Supporters say the term is now established, intuitively captures confident fabrication, and is useful for non‑experts.
- Several suggest “bullshitting” in Frankfurt’s philosophical sense: fluent, confident speech produced without concern for truth.
Why LLMs hallucinate – and why they work at all
- One camp: LLMs are statistical next‑token generators; hallucinations are the inevitable result of prediction under uncertainty and compressed world knowledge (see the sketch after this list).
- Another camp: the “just autocomplete” framing is technically true but misleading, since internal layers appear to build rich feature/world representations and in‑context learning mechanisms.
- Broad agreement: accuracy is high where training data is dense and consensus exists (e.g., popular languages, APIs); errors spike with sparse, fast‑changing, or controversial topics.
- Some argue information‑theoretic and complexity limits mean hallucinations can never be fully eliminated.
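To make the “statistical next‑token generator” bullet concrete, here is a minimal sketch of sampling from a next‑token distribution. The vocabulary, logits, and `temperature` knob are all invented for illustration; the point is that when probability mass is split across continuations, a fluent wrong answer is a routine sample, not a rare glitch.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the token after "The capital of Australia is":
# the model has seen "Sydney" often enough that it competes with the truth.
vocab  = ["Canberra", "Sydney", "Melbourne"]
logits = [2.1, 1.9, 0.4]  # invented numbers for illustration

probs = softmax(logits, temperature=1.0)
for token, p in zip(vocab, probs):
    print(f"{token:10s} {p:.2f}")   # ~0.50, 0.41, 0.09

# Sampling makes the wrong-but-plausible token a routine outcome:
# roughly 4 in 10 draws here say "Sydney".
print(random.choices(vocab, weights=probs, k=10))
```

Raising `temperature` flattens the distribution (more creative, more wrong answers); lowering it sharpens toward the argmax (more reliable, more repetitive), which is the trade‑off the sampling‑control bullet below refers to.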
Intelligence, world models, and limits
- Ongoing debate on whether LLMs “have” mental/world models or merely model their training sources and word co‑occurrences.
- Some see emergent capabilities (multimodal reasoning, internal features that track real‑world entities) as steps toward genuine world modeling and even future “minds.”
- Others insist they lack self‑knowledge and epistemology: they don’t know when they don’t know.
Mitigation strategies & tooling
- Proposed mitigations (toy sketches of the first three follow this list):
  - RAG with explicit grounding and separate factuality checkers.
  - Symbolic logic / theorem‑proving or semantic validators (e.g., for SQL) to catch structural errors.
  - Better calibration and explicit confidence estimates.
  - Tool use (compilers, interpreters, search) and secondary “fact‑check” passes.
  - Careful sampling/logprob control to trade creativity against reliability.
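One possible reading of the RAG‑plus‑factuality‑checker bullet, as a toy sketch: retrieve supporting passages, instruct the model to answer only from them, then reject answers whose content isn’t supported by the retrieved text. `retrieve`, `generate`, and the word‑overlap heuristic in `grounded` are hypothetical stand‑ins, not any particular library’s API.

```python
def _words(text):
    """Lowercase and strip trailing punctuation for crude matching."""
    return [w.strip(".,?!").lower() for w in text.split()]

def retrieve(query, corpus, k=2):
    """Toy retriever: rank passages by word overlap with the query."""
    q = set(_words(query))
    return sorted(corpus, key=lambda p: len(q & set(_words(p))), reverse=True)[:k]

def generate(prompt):
    """Placeholder for the actual LLM call; returns a canned answer here."""
    return "Canberra is the capital of Australia."

def grounded(answer, passages, threshold=0.6):
    """Separate factuality check: most of the answer's words must
    appear somewhere in the retrieved passages, else reject."""
    support = set(_words(" ".join(passages)))
    words = _words(answer)
    return sum(w in support for w in words) / max(len(words), 1) >= threshold

corpus = [
    "Canberra is the capital city of Australia.",
    "Sydney is the largest city in Australia.",
]
query = "What is the capital of Australia?"
passages = retrieve(query, corpus)
prompt = "Answer ONLY from this context:\n" + "\n".join(passages) + "\n\nQ: " + query
answer = generate(prompt)
print(answer if grounded(answer, passages) else "[rejected: answer not grounded in sources]")
```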
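The semantic‑validator idea for generated SQL can be sketched with SQLite’s `EXPLAIN`, which compiles a query against the real schema without executing it, so hallucinated tables or columns fail fast. The schema and candidate queries below are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

def validate_sql(sql):
    """Compile (but don't run) the query; structural errors surface here."""
    try:
        conn.execute("EXPLAIN " + sql)  # parses and plans without executing
        return True, None
    except sqlite3.Error as e:
        return False, str(e)

for candidate in [
    "SELECT name FROM users WHERE id = 1",  # valid
    "SELECT username FROM users",           # hallucinated column
    "SELECT name FROM customers",           # hallucinated table
]:
    ok, err = validate_sql(candidate)
    print("OK " if ok else "BAD", candidate, "" if ok else f"-> {err}")
```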
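For the calibration bullet, one common heuristic (sketched here with invented token/logprob pairs) is to treat the mean per‑token log‑probability as a confidence score and abstain below a threshold; real APIs that expose per‑token logprobs would supply the inputs, and the same logprobs feed the creativity‑vs‑reliability dial shown by the `temperature` parameter in the earlier softmax sketch.

```python
import math

def mean_logprob(token_logprobs):
    """Average per-token log-probability of a generated answer."""
    return sum(token_logprobs) / len(token_logprobs)

def answer_or_abstain(answer, token_logprobs, threshold=-1.0):
    """Abstain when the model's own token probabilities are low.
    threshold=-1.0 corresponds to ~0.37 average per-token probability."""
    score = mean_logprob(token_logprobs)
    if score < threshold:
        return f"[abstain: confidence {math.exp(score):.2f} too low]"
    return answer

# Invented logprobs: a confident answer vs. a shaky one.
print(answer_or_abstain("Canberra", [-0.1, -0.2]))        # kept
print(answer_or_abstain("Sydney", [-1.6, -2.3, -1.9]))    # abstains
```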
Societal and usability concerns
- Many are less worried about LLM behavior than about user interpretation and vendor marketing that portrays them as reliable, intelligent agents.
- Concern that people over‑trust confident answers, especially without domain knowledge or visible provenance, leading to misuse and misallocation of resources.