A conversation about AI for science with Jason Pruet
Role of DOE/National Labs and Context of the Interview
- Many see the piece as partly PR but still useful as context for how DOE and the national labs frame “AI for science” and national security.
- Labs are under strong top-down pressure to “do AI,” following an earlier public-cloud push; some view this as creating long-term rent streams for cloud vendors.
- DOE already runs major GPU-based exascale supercomputers and plans to provide infrastructure to universities.
- Some technical gripes surface about working on classified systems (no binaries, no PyTorch, awkward FIPS constraints); commenters say these constraints contributed to an earlier reliance on tools like Mathematica.
- Separate note: “1663” is explained as LANL’s science-and-tech magazine, named after its WWII PO box.
- One comment mentions recent heavy LANL layoffs and a sense of anxiety inside the lab.
Public–Private Partnerships, Capture, and Tech Transfer
- Central worry: if labs depend on industry for frontier models and compute, public research could be “captured” by commercial agendas—data, methods, and IP effectively controlled by a few firms.
- Some view this as part of a broader pattern of privatizing state capacity (real estate, missile defense as a subscription service, etc.), leaving government structurally dependent on contractors.
- Others argue this is exactly what tech transfer is for: public R&D → private commercialization, which historically enabled much of today’s tech stack.
- Counterpoint: that logic assumes fair, transparent processes; critics doubt that’ll hold when a handful of AI firms control crucial infrastructure.
- There is sharp distrust of defense contractors and “entrepreneurs” seen as driving cost inflation, fraud, and lock-in.
- Some call for AI critical to national security to be kept entirely inside DoD, without private IP; others stress that labs do work with a wide range of companies, including startups.
AI vs. Climate, Energy, and National Priorities
- One camp: fix global warming and build clean energy first, worrying that AI hype diverts capital, power, and political attention.
- Another: you can and must do both; delaying AI means trailing other economies and militaries. Historical analogies (Manhattan Project, Space Race) are invoked to justify large-scale tech investment.
- Debate on whether AI meaningfully accelerates climate solutions (optimization, power efficiency, planning) or is mostly a distraction that increases energy demand.
- Some argue a rapid build-out of renewables, storage, EVs, and heat pumps could halve emissions without AI; others emphasize that electricity is only a fraction of total emissions, with hard tradeoffs elsewhere.
- Subthread on “degrowth”: some say reduced per-capita energy use is necessary; others insist increasing energy use is core to progress (Kardashev-style thinking).
- Nationalism and geopolitics recur: concerns that “China will win if we slow AI” versus arguments that the competitive framing itself worsens both AI risk and climate risk.
What “AI for Science” Means
- Several comments note the article’s “AI for science” framing isn’t just about LLMs; examples like AlphaFold or geometry-proving systems are cited as more emblematic scientific advances.
- Some readers suspect that current AI enthusiasm at LANL and DOE is being politically driven (“because they’re being told to”) and worry about administration-level capture by tech interests.
- Others see real promise if national labs focus on scientific applications—simulation, materials, climate modeling—rather than primarily chasing commercial generative AI.
Benchmarks, Hype, and Real-World Performance
- The interview’s claim that AI now surpasses humans on almost all benchmarks is widely challenged.
- Commenters note that tools like Gemini can look impressive in short “play” sessions yet fail unpredictably on simple tasks, hallucinate, or produce plausible nonsense.
- There is frustration that current benchmarks underweight reliability, non-hallucination, and long-horizon reasoning—areas where humans still excel.
- Some think LLM progress is already plateauing and that parameter-scaling gives diminishing returns; others argue agent capabilities are clearly improving even if raw knowledge isn’t exploding anymore.
- A meta-critique emerges: optimists accuse skeptics of “performative cynicism” and moving goalposts; skeptics say claims of inevitable rapid improvement are marketing, not evidence.
Are LLMs “Good Coders”?
- Strong disagreement about the statement that modern models are “very good coders.”
- Supportive view:
  - They’re excellent at syntax, boilerplate, pattern recall, quick prototypes, and occasionally at insights that would take humans far longer.
  - They can transform workflows for developers who know enough to validate the output.
- Critical view:
  - They lack domain understanding, don’t know requirements, can’t judge ticket quality, and don’t understand cross-team impacts, so they’re autocomplete tools, not programmers.
  - Their failures are unpredictable: they sometimes flounder on trivial tasks and sometimes nail advanced ones.
  - They exhibit a kind of “systematic Dunning–Kruger”: confidently wrong, always producing something instead of admitting ignorance.
- Many see them as useful assistants when you can rapidly check their work, but not reliable enough to own end-to-end software tasks.
- There is similar skepticism about the interview’s claim that models are “very good legal analysts,” which some find especially implausible.
Governance, AI Futures, and General Mood
- Some prefer commercially driven AI development over state/military-led programs; others argue that once research and infrastructure depend on private platforms, control shifts dangerously fast.
- A playful but serious tangent speculates about AI CEOs: one commenter extrapolates from current agent time-horizon benchmarks to predict AI-led firms in the 2030s, while others treat this as premature.
- The thread ends with a mix of awe and dread: AI feels like it could usher in a renaissance or a breakdown, and many commenters explicitly say it’s unclear which path we’re on.