The most underreported story in AI is that scaling has failed to produce AGI
Debate over Marcus’s Critique of Deep Learning
- Some see him as a long‑time “deep learning can’t do real AI” voice whose goalposts keep moving and who selectively highlights failures.
- Others argue his core criticisms (hallucinations, brittleness, hype) have largely held up and that his consistency is a strength, not a flaw.
- He responds that his earlier “hitting a wall” predictions were mostly accurate and points to public prediction audits.
- Several commenters wish he were less partisan in tone, but still value him as a counterweight to corporate hype.
Scaling, Plateau, and Hallucinations
- Many agree that the payoff from simple scaling has diminished since GPT‑3.5: hallucinations persist, reliability is limited, and “agents” work only in narrow domains.
- Some claim hallucinations may be intrinsic to the current next‑token paradigm; others see them as reducible but not yet well controlled.
- Counting letters in words (e.g., “strawberry”) is used as a toy example: critics see persistent failure as evidence of shallow pattern‑matching; defenders say it’s mostly an artifact of tokenization, not a fundamental limit.
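The tokenization point can be made concrete with a minimal sketch. This uses a hypothetical three-piece toy vocabulary (not any real model’s tokenizer) and greedy longest-prefix matching, a simplification of BPE-style tokenization: the model consumes token IDs, not characters, so the letter count is not directly visible in its input.

```python
# Hypothetical toy vocabulary: "strawberry" becomes three subword pieces,
# so the model never sees individual letters.
toy_vocab = {"str": 101, "aw": 202, "berry": 303}

def tokenize(word, vocab):
    """Greedy longest-prefix tokenization (simplified BPE-style)."""
    tokens = []
    while word:
        for length in range(len(word), 0, -1):
            piece = word[:length]
            if piece in vocab:
                tokens.append(vocab[piece])
                word = word[length:]
                break
        else:
            raise ValueError(f"no vocabulary piece matches {word!r}")
    return tokens

ids = tokenize("strawberry", toy_vocab)
print(ids)  # [101, 202, 303] — the model's actual input

# Counting 'r's requires inverting the vocabulary back to characters,
# an operation the model's input representation does not expose:
inv = {v: k for k, v in toy_vocab.items()}
print("".join(inv[i] for i in ids).count("r"))  # 3
```

On this view the failure mode is representational: character-level questions ask the model to recover information that the tokenizer has already collapsed away, which is consistent with the “artifact of tokenization” defense without settling whether deeper limits also apply.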
LLMs, “Reasoning” Models, and Anthropomorphism
- A minority reports an “inflection point” with new “thinking” models (e.g., chain‑of‑thought + RL search) that feel qualitatively different and more capable at stepwise reasoning.
- Others insist these are still just stacked LLM calls plus search over prompts, not genuine reasoning or agency.
- There’s recurring pushback on anthropomorphizing: chatbots are framed as fictional characters being “acted out” by a document‑completion machine.
Expectations for AGI and Theory
- Multiple commenters note there is no solid theoretical argument that language models should yield AGI, only extrapolation and belief.
- AGI enthusiasm is compared to quasi‑religious or crypto‑like hype; some see “building God” vibes among true believers.
- Others argue simple systems layered at scale produced human intelligence, so dismissing next‑token predictors as “just statistics” is premature.
Cost, Usefulness, and Limits of the Approach
- Critics emphasize petabyte‑scale data, massive GPU and power costs, and limited reliability relative to simple human skills as signs this path is inefficient and maybe fundamentally flawed.
- Supporters reply that even replacing 5–10% of jobs or enabling narrow but reliable agents would be historically huge.
Meta: Polarization and Skepticism
- The thread is seen as polarized into an “AI hype train” camp and a camp skeptical of both the hype and the doom narratives.
- Some argue science should default to skepticism of grand claims; others worry that entrenched partisanship (on both sides) now dominates the discourse.