Detecting when LLMs are uncertain

Sampling, branching, and “thinking tokens”

  • Several comments liken Entropix-style decoding to maze traversal or search (beam search, MCTS), where extra compute explores alternative token paths.
  • Some see richer samplers as aligned with the “more compute/search wins” view, possibly similar to what big labs do for reasoning models.
  • Others argue that countless sampling schemes have already been proposed; without strong benchmarks it is very hard to show that any of them clearly beats standard top‑k/top‑p.
  • Thinking/“reasoning” tokens are viewed as an interesting but somewhat ad‑hoc idea; some prefer mathematically grounded methods like MCTS.
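The Entropix-style idea the commenters debate can be sketched as a toy decision rule: compute the entropy and varentropy of the current step's logits, then pick a decoding strategy. Everything here is illustrative — the thresholds, strategy names, and the three-way split are assumptions for the sketch, not Entropix's actual implementation.

```python
import numpy as np

def entropy_varentropy(logits):
    """Shannon entropy (nats) and varentropy of one step's logit vector.

    Varentropy is the variance of the per-token surprisal -log p(token)
    under the model's own distribution.
    """
    z = logits - logits.max()                  # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    surprisal = -np.log(probs + 1e-12)
    H = (probs * surprisal).sum()              # entropy = E[surprisal]
    V = (probs * (surprisal - H) ** 2).sum()   # varentropy = Var[surprisal]
    return H, V

def choose_strategy(logits, h_lo=0.5, h_hi=3.0):
    """Toy gate: greedy when confident, sample when unsure, branch when lost.

    h_lo/h_hi are made-up thresholds for illustration only.
    """
    H, _V = entropy_varentropy(logits)
    if H < h_lo:
        return "greedy"   # sharply peaked distribution: just take the argmax
    elif H < h_hi:
        return "sample"   # moderate uncertainty: ordinary temperature sampling
    else:
        return "branch"   # very flat distribution: explore alternative paths
```

A real implementation would also use the varentropy (e.g. to distinguish "uniformly unsure" from "torn between a few options"), which this sketch only computes.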

Entropy, varentropy, and uncertainty estimation

  • Critics say Entropix misuses information‑theoretic terms: the per‑token entropy of the model's logits is not the true entropy of the underlying sequence distribution.
  • They warn against slapping “entropy/varentropy” on heuristic scores without clear theory or math, and note tradeoffs: reducing hallucinations likely reduces output diversity.
  • Others point to “semantic entropy” work and to broader surveys/benchmarks of LLM uncertainty methods; these find that sophisticated semantic clustering sometimes helps, but simple baselines (e.g., average token entropy) often perform similarly.
  • Bayesian neural nets and other formal uncertainty approaches exist but are compute‑heavy and hard to train.
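The “semantic entropy” approach mentioned above can be sketched as: sample several answers, group the ones that mean the same thing, and compute entropy over the groups rather than over surface strings. In the actual work the grouping is done with bidirectional entailment models; in this sketch the cluster labels are assumed to be given, which is the simplifying assumption.

```python
import math
from collections import Counter

def semantic_entropy(cluster_ids):
    """Entropy (nats) over meaning-clusters of sampled answers.

    cluster_ids: one label per sampled answer, where two answers share a
    label iff they express the same meaning. The clustering itself (done
    via entailment checks in the original work) is assumed done upstream.
    """
    n = len(cluster_ids)
    counts = Counter(cluster_ids)
    # Standard Shannon entropy over the empirical cluster distribution.
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

Ten paraphrases of one answer give entropy 0 (the model is consistent in meaning despite varied wording), while answers split evenly across two incompatible meanings give ln 2 — which is why this can detect confabulation that raw string-level entropy misses.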

Escape hatches and abstention

  • Multiple commenters want APIs and samplers that expose uncertainty and can trigger “I’m not sure” or rejection/abstention instead of forced answers.
  • This is especially desired for agents, RAG hallucination detection, and data‑structuring tasks that need per‑field confidence.
  • Rejection‑verification curves are highlighted as a standard way to evaluate whether an uncertainty score actually tracks output quality.
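A rejection-verification (accuracy-vs-coverage) curve can be computed directly: rank outputs by an uncertainty score, then sweep an abstention threshold and record accuracy on the retained fraction. This is a generic sketch of the evaluation, not tied to any particular library or scoring method.

```python
import numpy as np

def rejection_curve(scores, correct):
    """Accuracy vs. coverage when abstaining on the most-uncertain outputs.

    scores:  uncertainty per output (higher = less confident)
    correct: 1/0 per output, whether it was actually right
    Returns (coverage, accuracy) arrays; if the score tracks quality,
    accuracy should decay gracefully as coverage grows.
    """
    scores = np.asarray(scores, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(scores)                   # most confident first
    sorted_correct = correct[order]
    n = len(scores)
    coverage = np.arange(1, n + 1) / n           # fraction of outputs kept
    accuracy = np.cumsum(sorted_correct) / np.arange(1, n + 1)
    return coverage, accuracy
```

A flat curve means the score is uninformative (abstention buys nothing); a curve that starts high and declines means abstaining on high-uncertainty outputs genuinely trades coverage for correctness.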

Debates on LLM “certainty” and understanding

  • One camp insists LLMs are just statistical text models with no world model, intent, or genuine certainty; “confidence” is purely human interpretation of probabilities.
  • Others counter that internal activations correlate with truthfulness/uncertainty and that, functionally, this behaves like a form of confidence, regardless of consciousness.
  • There’s extended debate over anthropomorphic terms like “hallucination,” with alternatives like “confabulation” or simply “wrong/inaccurate” suggested.

Trust, applications, and evaluation

  • Some distrust LLMs for autonomous actions, arguing that every output is fundamentally a guess.
  • Others report strong practical success (e.g., non‑programmers building production scripts), while emphasizing human oversight.
  • Overall sentiment: detecting and using uncertainty is valuable but technically hard; Entropix‑style methods are seen as intriguing yet unproven without rigorous, task‑level benchmarks.