Overcoming the limits of current LLMs

Training data, licensing, and moats

  • Many see high‑quality, “tidy”, properly licensed data as the real moat: it is harder to obtain than more compute or a larger web scrape.
  • Exclusive content deals (e.g., with major news outlets) are viewed as anti‑competitive and as pushing toward “technofeudal” dynamics, where capital wins regardless of the legal stance on scraping.
  • Without major media, random forum posts become over‑represented in training data, which some find darkly amusing but also concerning.

Nature and terminology of hallucinations

  • Strong debate over the term “hallucination”: alternatives proposed include “incoherent output”, “confabulation”, and “bullshitting”.
  • Some argue “hallucination” wrongly implies perceptual errors and human‑like minds; others say it’s already widely understood and language is flexible.
  • Several commenters stress that LLMs always generate statistically plausible text rather than track truth: “some outputs happen to be true”, but the model does not care about correctness.

Can better corpora fix hallucinations?

  • Skeptics: even a perfect corpus can’t eliminate hallucinations, especially under stochastic sampling (temperature > 0) and for domains like math where generalization, not memorization, is needed.
  • Optimists: more consistent, higher‑quality data (as in Phi‑style training) can reduce error rates, though building such corpora at scale may be practically impossible.
  • There is concern that the article underestimates contradictions in science itself and overestimates the existence of a “universally coherent” dataset.
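
The skeptics' point about stochastic sampling can be illustrated with a toy sketch. It assumes a made-up three-token vocabulary with hand-picked logits (not taken from any real model), where index 0 is the "correct" token; the function names are hypothetical. Even though the correct token has the highest logit, any temperature above 0 leaves a non-zero chance of sampling a wrong one.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 for softmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Greedy pick at temperature 0, otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def error_rate(logits, correct_index, temperature, trials=10_000, seed=0):
    """Fraction of sampled tokens that differ from the correct one."""
    rng = random.Random(seed)
    wrong = sum(
        sample_token(logits, temperature, rng) != correct_index
        for _ in range(trials)
    )
    return wrong / trials

if __name__ == "__main__":
    toy_logits = [5.0, 2.0, 1.0]  # assumption: index 0 is the true answer
    for t in (0, 0.5, 1.0, 2.0):
        print(f"temperature={t}: error rate {error_rate(toy_logits, 0, t):.4f}")
```

Greedy decoding (temperature 0) never errs in this toy case only because the top logit happens to be correct; it does nothing for cases where the model's most likely token is itself wrong.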

Logic, reasoning, and AGI limits

  • Many note LLMs still struggle with counting, arithmetic, and formal logic; some see this as evidence they can’t directly scale into AGI.
  • Others argue LLMs can be components in larger systems with planners, code execution, or search (e.g., MCTS, program synthesis), even if they aren’t planners themselves.
  • Undecidability, complexity theory, and limits of automated theorem proving are cited as deeper obstacles to “perfect reasoning”.
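
As a rough illustration of the "LLM as a component" view, arithmetic can be handed to a deterministic tool instead of being predicted token by token. The sketch below assumes the model has already emitted an arithmetic expression as a string; `evaluate_arithmetic` is a hypothetical helper, not part of any framework named in the thread.

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def evaluate_arithmetic(expression):
    """Evaluate a numeric expression exactly, without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"not a valid expression: {expression!r}") from exc
    return walk(tree.body)

if __name__ == "__main__":
    # The surrounding system would route model-produced expressions here.
    print(evaluate_arithmetic("1234 * 5678 + 9"))
```

The design choice is that the model only has to produce the expression; correctness of the result then rests on the evaluator rather than on next-token prediction.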

Techniques to reduce or work around hallucinations

  • Popular mitigation ideas:
    • Using multiple models or multiple samples, plus a discriminator or voting step, to detect and resample bad answers.
    • RAG and external tools/APIs, though vector search alone is seen as insufficient, especially for structured data.
    • Agentic systems that run code, interact with environments, and get feedback reportedly reduce hallucinations in practice.
    • Training models to detect logical fallacies is suggested but viewed as hard, given current failures at basic tasks like counting.
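
The multiple-samples-plus-voting idea can be sketched as follows. This is a minimal sketch, assuming a hypothetical `generate(prompt)` callable that returns one sampled answer string per call; the agreement threshold and round limit are illustrative defaults, not values from the discussion.

```python
from collections import Counter

def vote(samples, min_agreement=0.6):
    """Return (majority answer or None, agreement fraction) for the samples."""
    if not samples:
        return None, 0.0
    # Normalise lightly so trivial formatting differences do not split votes.
    counts = Counter(s.strip().lower() for s in samples)
    answer, count = counts.most_common(1)[0]
    agreement = count / len(samples)
    return (answer if agreement >= min_agreement else None), agreement

def answer_with_voting(generate, prompt, n_samples=5,
                       min_agreement=0.6, max_rounds=3):
    """Draw several samples; resample the whole batch if they disagree."""
    for _ in range(max_rounds):
        samples = [generate(prompt) for _ in range(n_samples)]
        answer, _agreement = vote(samples, min_agreement)
        if answer is not None:
            return answer
    return None  # no stable majority: better to abstain than to guess

if __name__ == "__main__":
    canned = iter(["Paris", "paris", "Lyon", "Paris ", "Paris"])
    print(answer_with_voting(lambda _prompt: next(canned), "Capital of France?"))
```

Voting only catches errors that vary between samples; an answer the model gets wrong consistently will still win the vote, which is why commenters pair it with external checks.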

Practical use cases and changing workflows

  • Several commenters report large productivity gains with current models (GPT‑4o, Claude), particularly for:
    • Test‑driven development (letting the LLM generate tests and refactor code).
    • Socratic brainstorming and fleshing out half‑formed ideas.
    • Acting as a diligent “junior dev” for boilerplate tasks.
  • Key pattern: stop treating the LLM as a one‑shot oracle; instead use iterative dialogue, self‑tests, and external verification.
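
The iterate-and-verify pattern might look like the loop below. It is a sketch under stated assumptions: `propose(task, feedback)` is a hypothetical stand-in for a model call that returns Python source, and the checks are ordinary input/output cases supplied by the user. Note that `exec` is not a sandbox; real use would need proper isolation.

```python
def run_checks(source, func_name, cases):
    """Run the candidate against test cases; return None or a failure message."""
    namespace = {}
    try:
        exec(source, namespace)  # assumption: trusted or sandboxed environment
        func = namespace[func_name]
        for args, expected in cases:
            got = func(*args)
            if got != expected:
                return f"{func_name}{args} returned {got!r}, expected {expected!r}"
    except Exception as exc:  # syntax errors, missing function, runtime errors
        return f"{type(exc).__name__}: {exc}"
    return None

def refine(propose, task, func_name, cases, max_turns=4):
    """Ask for a draft, verify it externally, and feed failures back."""
    feedback = None
    for _ in range(max_turns):
        source = propose(task, feedback)
        feedback = run_checks(source, func_name, cases)
        if feedback is None:
            return source  # passed every external check
    return None  # give up rather than accept an unverified draft

if __name__ == "__main__":
    drafts = iter([
        "def add(a, b):\n    return a - b\n",  # first draft has a bug
        "def add(a, b):\n    return a + b\n",  # corrected after feedback
    ])
    print(refine(lambda task, fb: next(drafts), "add two numbers",
                 "add", [((1, 2), 3), ((0, 0), 0)]))
```

The verifier, not the model, decides when the loop ends, which is the main difference from treating the model as a one-shot oracle.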

Philosophical and societal questions

  • Some argue hallucination is inherent to intelligence when viewed as lossy compression, and that humans themselves have incoherent world models.
  • Others question whether pushing better chatbots actually makes the world better, expressing unease about talent and capital flowing into this area.
  • There is curiosity but no consensus on whether hallucination is in any sense a “milestone toward consciousness”; the question is largely left open.