Ilya Sutskever NeurIPS talk [video]
Peak data & limits of current scaling
- Multiple commenters focus on the claim that “pre‑training as we know it will end” because we’ve hit “peak data.”
- Some see this as an important public acknowledgment that increasing model size and internet-scale data no longer guarantee easy gains.
- Others argue we haven’t exhausted what can be learned from existing data; current methods are inefficient at extracting knowledge.
Future training data sources
- Suggestions include proprietary corpora (e.g., news, books, pharma, energy, internal codebases) where owners can sidestep copyright issues.
- Ideas for new data generation: robots in the real world, continuous learning from users, self‑driving logs, surveillance video, XR/smart glasses, personal telemetry (keylogging, screenshots, etc.).
- Some propose large-scale book scanning or reviving old digitization projects.
- Concern that many remaining rich datasets are locked in commercial silos and will stay closed.
Synthetic data: usefulness debated
- One camp claims synthetic datasets are mostly useless beyond narrow cases; better to re‑use real data.
- Others counter that major labs report strong gains from synthetic data and that the question is unsettled.
- It’s noted that Sutskever himself sounded skeptical about synthetic data in the talk, though some commenters say he may be wrong.
Domain‑specific models and expert work
- Lively thread on “state law LLMs” and narrow experts:
  - Supporters think curated, smaller domains (law, proprietary code, niche languages) can yield near‑expert models and commoditize expertise, reducing demand for specialists.
  - Critics argue law in particular depends on real‑world context, messy incentives, and high stakes; LLM‑grade answers are risky when errors are costly.
- Parallel drawn to code: LLMs already help non‑experts, but their outputs still need human review.
Reasoning, agents, and unpredictability
- Discussion on “agentic intelligence” as models that set goals, plan, and act autonomously, versus today’s answer‑only systems.
- Some agree with the claim that “more reasoning is more unpredictable,” linking useful reasoning to non‑obvious, hard‑to‑anticipate outputs.
- Others push back, saying reasoning is in principle deterministic; unpredictability is about our limited ability to follow it.
Self‑awareness and meaning
- Extended debate on whether current models are “self‑aware” in any meaningful sense:
  - One side points to models’ ability to talk about themselves and adapt behavior as a minimal form of self‑awareness.
  - The other insists this is just pattern completion from instruction tuning, with no genuine intent or inner experience.
- Philosophical arguments invoke the Chinese room, “theory of mind,” and whether meaning exists without observers.
Biology analogies & brain/body scaling
- Several criticize references to “neurons” and brain–body mass ratios:
  - Biological neurons are biophysically very different from transformer units.
  - Brain/body ratio is a noisy correlate of intelligence; examples like birds or ants complicate simple scaling stories.
- Others defend loose analogy as historically useful inspiration, even if not biologically faithful.
Talk quality, context, and hype
- Many find the talk underwhelming or “fluffy,” saying it offered little new to people following the field and leaned on a grand, speculative tone.
- Clarification: this was a NeurIPS “Test of Time” award talk about a 2014 paper, partly retrospective rather than a new technical result.
- Some note a pattern of overly optimistic timelines from prominent figures, attributing this partly to fundraising incentives.
- Broader concern that NeurIPS and AI discourse are increasingly dominated by “bros,” grifters, and hype, overshadowing careful research.
Ethics, environment, and resource analogies
- The “internet as oil” metaphor is read by some as an admission of extractive business models.
- Environmental worries surface around compute and data center water use (“boiling lakes”).
- A few raise the prospect that early powerful AIs will effectively be slaves and warn about delayed recognition of their moral status.