Diffusion for World Modeling
Overall reaction
- Many find the demo striking and “dreamlike,” with some saying it’s the first paper in a while that makes them want new GPUs.
- Others see it as mostly a cool proof‑of‑concept with limited direct usefulness in its current form.
Use in games and graphics
- Some predict most game graphics will move to diffusion‑based rendering within a few years, enabling photorealism and “limitless physics.”
- Skeptics argue entire games won’t be run by ML: engines need stable, debuggable rules, not “dream logic.”
- More moderate views: ML is likely for subsystems—rendering, upscaling, animation, NPC behavior—rather than full game state.
- Several see near‑term value as a “skin” or remaster layer over existing low‑fidelity games, similar in spirit to DLSS/RTX Remix.
World models, RL, and robotics
- Commenters stress the real target is general world models for autonomous agents, not recreating Counter‑Strike.
- Video game environments are used as cheap, controllable testbeds; the same methods could be trained on real‑world video + sensor data.
- In RL, such models let agents “imagine” consequences instead of acting directly in the world.
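As a rough illustration of the "imagine consequences" point above, here is a minimal sketch of planning inside a model rather than in the real environment. Everything in it is an assumption made for illustration: `toy_world_model` is a hand-written stand-in for a learned model, and the corridor, goal position, and horizon are hypothetical. It is not the method from the paper under discussion.

```python
from itertools import product

# Hypothetical stand-in for a learned world model: a 1-D corridor where the
# state is a position in [0, 10] and actions move left (-1), stay (0), or right (+1).
def toy_world_model(state, action):
    """Predict the next state and reward for a (state, action) pair."""
    next_state = max(0, min(10, state + action))
    reward = 1.0 if next_state == 7 else 0.0  # goal at position 7 (assumed)
    return next_state, reward

def imagine_return(model, state, actions):
    """Roll an action sequence forward inside the model only; no real steps are taken."""
    total = 0.0
    for action in actions:
        state, reward = model(state, action)
        total += reward
    return total

def plan(model, state, horizon=3, actions=(-1, 0, 1)):
    """Score every action sequence of length `horizon` in imagination and
    return the first action of the best-scoring sequence."""
    best = max(
        product(actions, repeat=horizon),
        key=lambda seq: imagine_return(model, state, seq),
    )
    return best[0]

# Starting at position 5, the best imagined plan heads right toward the goal.
first_action = plan(toy_world_model, 5)
```

The agent only calls the model while planning, so bad candidate sequences cost compute rather than real-world mistakes. Exhaustive search is used here purely for brevity.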
Prediction vs understanding
- A long subthread debates whether neural nets “only predict” or can “understand.”
- One side equates scientific understanding with curve‑fitted predictive models; the other insists that human‑style abstraction and generalization differ from current ML behavior.
- Disagreements focus on conservation laws, historical scientific discovery, and whether future models could reach human‑level insight.
Limitations and technical concerns
- Current model has poor long‑term consistency and almost no explicit map or state awareness; walking into walls or doing unusual actions produces plausible but wrong “gibberish.”
- Memory is effectively just recent frames + inputs; world continuity and inventory/state tracking are weak.
- Compute cost is high: the model needs high‑end GPUs yet runs at low resolution and modest FPS.
- For physics, some suggest ML approximations for complex phenomena (fluids, explosions, lighting), but others note determinism, debuggability, and multiplayer consistency concerns.
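The "recent frames + inputs" memory limitation noted above can be pictured as a fixed-length sliding window. The sketch below is only an illustration under stated assumptions: `FrameContext` is a hypothetical class, and the window size of 4 is arbitrary, not a figure from the thread or the paper.

```python
from collections import deque

class FrameContext:
    """Fixed-length buffer of (frame, action) pairs.

    Hypothetical stand-in for the conditioning window of a frame-based
    world model; the real model's context handling may differ.
    """

    def __init__(self, window=4):
        # deque with maxlen silently drops the oldest entry when full
        self.buffer = deque(maxlen=window)

    def push(self, frame, action):
        self.buffer.append((frame, action))

    def visible_frames(self):
        """Return the frames the model could still condition on."""
        return [frame for frame, _ in self.buffer]

ctx = FrameContext(window=4)
for t in range(6):
    ctx.push(f"frame_{t}", "forward")

visible = ctx.visible_frames()
```

After six steps the first two frames are gone from the window, so anything seen only in them (a room layout, an item picked up) has no way to influence later predictions unless it is still visible on screen.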
Dreamlike aesthetics and cognition parallels
- Many note the uncanny, noisy, shifting visuals resemble dreams or psychedelic experiences.
- Some suggest human dreams and perception might share structural similarities with diffusion‑style generative processes, though nobody in the thread offers evidence for this.