Overcoming the limits of current LLMs
Training data, licensing, and moats
- Many see high‑quality, “tidy”, properly licensed data as the real moat: it is harder to obtain than more compute or more scraped web text.
- Exclusive content deals (e.g., with major news outlets) are viewed as anti‑competitive and as pushing “technofeudal” dynamics in which capital wins regardless of the legal stance on scraping.
- If major media is missing from training data, random forum posts become over‑represented, which some find darkly amusing but also concerning.
Nature and terminology of hallucinations
- Strong debate over the term “hallucination”: alternatives proposed include “incoherent output”, “confabulation”, and “bullshitting”.
- Some argue “hallucination” wrongly implies perceptual errors and human‑like minds; others say it’s already widely understood and language is flexible.
- Several commenters stress that LLMs are always generating statistically plausible text, not tracking truth; “some outputs happen to be true” rather than the model caring about correctness.
Can better corpora fix hallucinations?
- Skeptics: even a perfect corpus can’t eliminate hallucinations, especially under stochastic sampling (temperature > 0) and for domains like math where generalization, not memorization, is needed.
- Optimists: more consistent, higher‑quality data (as in Phi‑style training) can reduce error rates, though building such corpora at scale may be practically impossible.
- There is concern that the article underestimates contradictions in science itself and overestimates the existence of a “universally coherent” dataset.
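
The skeptics' point about stochastic sampling can be illustrated with a toy sampler. This is only a sketch under stated assumptions: the three-token vocabulary and the logits are made up, not taken from any real model, and token 0 is arbitrarily designated the "correct" continuation. At temperature 0 decoding is greedy and always picks the top-ranked token; at any temperature above 0 the lower-ranked tokens keep nonzero probability, so some wrong picks occur even though the correct token is ranked first.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Greedy pick at temperature 0, otherwise sample from the tempered softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical logits: token 0 stands in for the "correct" continuation.
TOY_LOGITS = [3.0, 1.0, 0.5]

def error_rate(temperature, trials=10_000, seed=0):
    """Fraction of samples that are not token 0."""
    rng = random.Random(seed)
    wrong = sum(sample_token(TOY_LOGITS, temperature, rng) != 0 for _ in range(trials))
    return wrong / trials

if __name__ == "__main__":
    for t in (0, 0.5, 1.0):
        print(f"temperature={t}: error rate {error_rate(t):.3f}")
```

The sketch says nothing about whether a better corpus improves the ranking itself; it only shows that sampling adds its own source of error on top of whatever the model has learned.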
Logic, reasoning, and AGI limits
- Many note LLMs still struggle with counting, arithmetic, and formal logic; some see this as evidence they can’t directly scale into AGI.
- Others argue LLMs can be components in larger systems with planners, code execution, or search (e.g., MCTS, program synthesis), even if they aren’t planners themselves.
- Undecidability, complexity theory, and limits of automated theorem proving are cited as deeper obstacles to “perfect reasoning”.
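
The "LLM as a component" argument can be made concrete with a small tool layer. The sketch below is an illustration, not anything proposed in the thread: `eval_arithmetic` and `count_letter` are hypothetical helper names, and the idea is simply that a surrounding system routes arithmetic and counting to deterministic code instead of asking the model to produce the answer token by token.

```python
import ast
import operator

# Only plain arithmetic operators are allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def eval_arithmetic(expr):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expr, mode="eval").body)

def count_letter(text, letter):
    """Case-insensitive count of a single letter in a string."""
    return text.lower().count(letter.lower())

if __name__ == "__main__":
    print(eval_arithmetic("12345 * 6789"))
    print(count_letter("strawberry", "r"))
```

How a model decides when to call such a tool, and with which arguments, is the hard part and is left out here.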
Techniques to reduce or work around hallucinations
- Popular mitigation ideas:
  - Using multiple models or multiple samples plus a discriminator/voting to detect and resample bad answers.
  - RAG and external tools/APIs, though vector search alone is seen as insufficient, especially for structured data.
  - Agentic systems that run code, interact with environments, and get feedback reportedly reduce hallucinations in practice.
- Training models to detect logical fallacies is suggested but viewed as hard, given current failures at basic tasks like counting.
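
The "multiple samples plus voting" idea can be sketched as follows. This is a minimal illustration under assumptions: `ask` is a hypothetical callable standing in for a model call, answers are compared after simple whitespace and case normalization, and the agreement threshold is arbitrary. If no answer wins a strict majority, the batch is discarded and resampled.

```python
from collections import Counter

def majority_vote(answers, min_agreement=0.5):
    """Return the most common answer if more than `min_agreement` of samples agree, else None."""
    if not answers:
        return None
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner if count / len(normalized) > min_agreement else None

def answer_with_voting(ask, question, n_samples=5, max_rounds=3):
    """Sample several answers per round; resample the whole batch when there is no majority.

    `ask` is a hypothetical stand-in for a model call: question -> str.
    """
    for _ in range(max_rounds):
        samples = [ask(question) for _ in range(n_samples)]
        result = majority_vote(samples)
        if result is not None:
            return result
    return None  # no stable answer within the budget

if __name__ == "__main__":
    canned = iter(["Paris", "paris ", "Lyon", "Paris", "Marseille"])
    print(answer_with_voting(lambda q: next(canned), "Capital of France?"))
```

Agreement among samples is only a proxy for correctness: a model that is consistently wrong will pass the vote, which is one reason commenters pair this with external tools or a separate discriminator.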
Practical use cases and changing workflows
- Several commenters report large productivity gains with current models (GPT‑4o, Claude), particularly for:
  - Test‑driven development (letting the LLM generate tests and refactor code).
  - Socratic brainstorming and fleshing out half‑formed ideas.
  - Acting as a diligent “junior dev” for boilerplate tasks.
- Key pattern: stop treating the LLM as a one‑shot oracle; instead use iterative dialogue, self‑tests, and external verification.
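
The "not a one‑shot oracle" pattern amounts to a generate, verify, retry loop. The sketch below assumes two hypothetical callables: `generate(task, feedback)` stands in for a model call, and `verify(candidate)` is an external check returning `(ok, feedback)`. To keep it runnable, the model is replaced by a stub whose first draft is deliberately buggy, and the verifier is a pair of unit tests.

```python
def refine_until_verified(generate, verify, task, max_attempts=3):
    """Loop: generate a candidate, check it externally, feed failures back in.

    Returns (candidate, attempts_used), or (None, max_attempts) if nothing passes.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(task, feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, attempt
    return None, max_attempts

def run_tests(source):
    """External verification: execute the candidate and test its `add` function."""
    namespace = {}
    try:
        # exec is tolerable here only because the input is a local stub string;
        # real model output would need sandboxing.
        exec(source, namespace)
        assert namespace["add"](2, 3) == 5
        assert namespace["add"](-1, 1) == 0
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""

def stub_model(task, feedback):
    """Stand-in for a model: the first draft is buggy, the retry is fixed."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"

if __name__ == "__main__":
    code, attempts = refine_until_verified(stub_model, run_tests, "write add(a, b)")
    print(f"accepted after {attempts} attempt(s):\n{code}")
```

The loop is only as good as the verifier: tests that the model wrote itself can share the model's blind spots, so some commenters treat human review or independent checks as part of the workflow.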
Philosophical and societal questions
- Some argue hallucination is inherent to intelligence seen as lossy compression, and that humans themselves have incoherent world models.
- Others question whether pushing better chatbots actually makes the world better, expressing unease about talent and capital flowing into this area.
- There is curiosity, but no consensus, about whether hallucination is in any sense a “milestone toward consciousness”; the question is largely left open.