John Carmack talk at Upper Bound 2025

Scope and Setup of Carmack’s Project

  • Built an Atari-playing physical robot using camera input and joystick actuators, trained online in real time on a laptop GPU.
  • Emphasis is on generic methods, continual learning, sample efficiency, and robustness to physical issues (latency, noisy/“phantom” inputs, actuator wear), not just “solving Atari.”
  • Some see it as a useful constrained testbed for problems that appear in robotics (real-time control, catastrophic forgetting); others argue similar work in simulation and robotics (e.g., by GPU/robotics vendors, self‑driving stacks) already addresses these.
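One of the physical issues mentioned above, actuation/observation latency, has a standard generic workaround in the RL literature: keep the actions already sent (but not yet visible in any frame) in a queue, and let the policy condition on the stale frame plus that queue. The sketch below is purely illustrative of that trick under assumed names (`DelayAwareAgent`, `toy_policy`); it is not a description of Carmack's actual system.

```python
from collections import deque

class DelayAwareAgent:
    """Act under k steps of latency by treating (frame, in-flight actions)
    as the effective state. Illustrative sketch, not the talk's design."""

    def __init__(self, policy, latency_steps=2):
        self.policy = policy
        # Actions sent but not yet reflected in any observed frame.
        self.pending = deque([0] * latency_steps, maxlen=latency_steps)

    def act(self, frame):
        # Policy sees the stale frame plus the queue of pending actions.
        action = self.policy(frame, tuple(self.pending))
        self.pending.append(action)
        return action

# Toy policy: press "right" (1) unless the most recent in-flight action
# was already "right" -- just enough logic to exercise the queue.
def toy_policy(frame, pending):
    return 0 if pending[-1] == 1 else 1

agent = DelayAwareAgent(toy_policy, latency_steps=2)
actions = [agent.act(frame=None) for _ in range(4)]
print(actions)  # [1, 0, 1, 0]
```

The point of the queue is that without it, the policy would be reacting to frames that predate its own recent actions, which makes the problem non-Markovian.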

Atari, RL, and Generalization

  • Atari was historically a core RL benchmark and largely “solved” in emulators; multiple commenters argue that didn’t yield broadly useful, general algorithms.
  • A line of criticism: individual Atari games are low‑dimensional; tiny models plus hand‑crafted tricks can do well, so “progress” often reflects researcher priors rather than genuine general intelligence.
  • Counterpoint: revisiting Atari with real-time constraints, physical controllers, and multi‑game continuity remains valuable for studying transfer and catastrophic forgetting (game A performance shouldn't collapse after training on game B).
  • Several note that humans rapidly transfer game concepts and UI patterns across games; current RL systems mostly do not.
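The catastrophic-forgetting failure mode above, and the classic replay-buffer mitigation, can be shown in a few lines. The toy below stands in for "games" with two synthetic regression tasks and a linear model trained by gradient descent; everything here (task construction, learning rate, buffer size) is an illustrative assumption, not anything from the talk.

```python
import numpy as np

def make_task(seed, dim=8, n=200):
    # A synthetic "game": a random linear target to regress onto.
    r = np.random.default_rng(seed)
    w_true = r.normal(size=dim)
    X = r.normal(size=(n, dim))
    return X, X @ w_true

def train(w, X, y, lr=0.1, epochs=200):
    # Plain full-batch gradient descent on mean squared error.
    for _ in range(epochs):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(X)
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

dim = 8
Xa, ya = make_task(1, dim)   # "game A"
Xb, yb = make_task(2, dim)   # "game B"

# Sequential training: A then B. Task-A performance collapses.
w = train(np.zeros(dim), Xa, ya)
loss_a_before = mse(w, Xa, ya)
w = train(w, Xb, yb)
loss_a_after = mse(w, Xa, ya)

# Replay mitigation: mix a buffer of stored task-A samples into B training.
w2 = train(np.zeros(dim), Xa, ya)
Xmix = np.vstack([Xb, Xa[:100]])
ymix = np.concatenate([yb, ya[:100]])
w2 = train(w2, Xmix, ymix)
loss_a_replay = mse(w2, Xa, ya)

print(loss_a_before, loss_a_after, loss_a_replay)
```

With a single shared set of weights, training on B simply overwrites the solution for A; the replay buffer keeps some of A's gradient signal in the mix, so A degrades far less.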

Continuous Learning, Memory, and Human vs LLM Cognition

  • Debate over the “missing ingredient”: proposals include continuous lifelong learning, better memory systems, and richer physical environments.
  • One side stresses that humans constantly adapt, filter input, and retain key experiences over long timescales; current models largely don’t update weights online in this way.
  • Others argue most impactful human memories are sparse “surprise/arousal” events, implying that a well‑designed persistent memory + context management system might suffice for many tasks.
  • Skepticism that large context windows and vector DBs alone are enough for robust real‑world agents; issues with forgetting, retrieval, and lack of autonomous weight updates are highlighted.
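The "sparse surprise events" idea above, that memory should be written only when something violates the agent's predictions, can be sketched as a gate on prediction error. The class name, threshold, and nearest-neighbour retrieval below are all illustrative assumptions, not a proposal from the thread.

```python
import numpy as np

class SurpriseGatedMemory:
    """Toy persistent memory: store an observation only when prediction
    error ("surprise") exceeds a threshold. Illustrative sketch only."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.keys, self.values = [], []

    def maybe_store(self, obs, prediction, actual):
        surprise = float(np.linalg.norm(np.asarray(prediction) - np.asarray(actual)))
        if surprise > self.threshold:
            self.keys.append(np.asarray(obs, dtype=float))
            self.values.append(np.asarray(actual, dtype=float))
        return surprise

    def recall(self, query):
        # Nearest-neighbour retrieval over stored observation keys.
        if not self.keys:
            return None
        dists = [np.linalg.norm(k - np.asarray(query)) for k in self.keys]
        return self.values[int(np.argmin(dists))]

mem = SurpriseGatedMemory(threshold=0.5)
# Mild prediction error: gated out, nothing stored.
mem.maybe_store(obs=[0.0, 0.0], prediction=[1.0], actual=[1.1])
# Large prediction error: a "surprise", so it is stored.
mem.maybe_store(obs=[1.0, 1.0], prediction=[0.0], actual=[2.0])
print(len(mem.keys))  # 1
```

The gate is what keeps such a memory sparse; the skepticism in the thread is about whether retrieval over a store like this can substitute for actual weight updates.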

Embodied Intelligence vs LLM “Blender” Pretraining

  • Carmack explicitly contrasts learning from a stream of interactive experience with “throw‑everything‑in‑a‑blender” LLM pretraining.
  • Some agree that embodied, interactive learning is crucial for AGI or for genuine concept formation and physical competence.
  • Others note that frontier models are already multimodal (text, audio, images, video) and that massive pretraining plus RL in rich simulations may scale better than slow physical training.
  • There’s concern that because pretraining is so effective and commercially valuable, interactive‑learning research may be underfunded despite its conceptual importance.

Carmack’s Role and Prospects

  • Many express excitement and trust in his track record of doing more with less and extracting maximal performance from commodity hardware.
  • Skeptics question whether past graphics/engine brilliance translates to leading AI research in a crowded, math‑heavy, hyper‑competitive field.
  • Several suggest his biggest potential impact may be in systems, optimization, and tooling (e.g., more efficient GPU stacks) rather than novel learning theory per se.