Ilya Sutskever NeurIPS talk [video]

Peak data & limits of current scaling

  • Multiple commenters focus on the claim that “pre‑training as we know it will end” because we’ve hit “peak data.”
  • Some see this as an important public acknowledgment that scaling model size and internet-scale data no longer guarantees easy gains.
  • Others argue we haven’t exhausted what can be learned from existing data, since current methods extract knowledge from it inefficiently.

Future training data sources

  • Suggestions include proprietary corpora (e.g., news, books, pharma, energy, internal codebases) where owners can sidestep copyright issues.
  • Ideas for new data generation: robots in the real world, continuous learning from users, self‑driving logs, surveillance video, XR/smart glasses, personal telemetry (keylogging, screenshots, etc.).
  • Some propose large-scale book scanning or reviving old digitization projects.
  • Concern that many remaining rich datasets are locked in commercial silos and will stay closed.

Synthetic data: usefulness debated

  • One camp claims synthetic datasets are mostly useless beyond narrow cases; better to re‑use real data.
  • Others counter that major labs report strong gains from synthetic data and that the question is unsettled.
  • It’s noted that the talk itself expresses skepticism about synthetic data, though some commenters argue Sutskever may be wrong on this point.

Domain‑specific models and expert work

  • Lively thread on “state law LLMs” and narrow experts:
    • Supporters think curated, smaller domains (law, proprietary code, niche languages) can yield near‑expert models and commoditize expertise, reducing demand for specialists.
    • Critics argue law in particular depends on real‑world context, messy incentives, and high stakes; LLM‑grade answers are risky when errors are costly.
    • Parallel drawn to code: LLMs already help non‑experts, but their outputs still need human review.

Reasoning, agents, and unpredictability

  • Discussion on “agentic intelligence” as models that set goals, plan, and act autonomously, versus today’s answer‑only systems.
  • Some agree with the claim that “more reasoning is more unpredictable,” linking useful reasoning to non‑obvious, hard‑to‑anticipate outputs.
  • Others push back, saying reasoning is in principle deterministic; unpredictability is about our limited ability to follow it.

Self‑awareness and meaning

  • Extended debate on whether current models are “self‑aware” in any meaningful sense:
    • One side points to models’ ability to talk about themselves and adapt behavior as trivial self‑awareness.
    • The other insists this is just pattern completion from instruction tuning, with no genuine intent or inner experience.
  • Philosophical arguments invoke the Chinese room, “theory of mind,” and whether meaning exists without observers.

Biology analogies & brain/body scaling

  • Several criticize references to “neurons” and brain–body mass ratios:
    • Biological neurons are biophysically very different from transformer units.
    • Brain/body ratio is a noisy correlate of intelligence; examples like birds or ants complicate simple scaling stories.
  • Others defend loose analogy as historically useful inspiration, even if not biologically faithful.

Talk quality, context, and hype

  • Many find the talk underwhelming or “fluffy,” saying it offered little new to people following the field and leaned on grand, speculative tones.
  • Clarification: this was a NeurIPS “Test of Time” award talk about a 2014 paper, partly retrospective rather than a new technical result.
  • Some note a pattern of overly optimistic timelines from prominent figures, attributing this partly to fundraising incentives.
  • Broader concern that NeurIPS and AI discourse are increasingly dominated by “bros,” grifters, and hype, overshadowing careful research.

Ethics, environment, and resource analogies

  • The “internet as oil” metaphor is read by some as an admission of extractive business models.
  • Environmental worries surface around compute and data center water use (“boiling lakes”).
  • A few raise the prospect that early powerful AIs will effectively be slaves and warn about delayed recognition of their moral status.