A conversation about AI for science with Jason Pruet

Role of DOE/National Labs and Context of the Interview

  • Many see the piece as partly PR but still useful as context for how DOE and the national labs frame “AI for science” and national security.
  • Labs are under strong top-down pressure to “do AI,” following an earlier public-cloud push; some view this as creating long-term rent streams for cloud vendors.
  • DOE already runs major GPU-based exascale supercomputers and plans to provide infrastructure to universities.
  • Some technical gripes surface about working on classified systems (no binaries, no PyTorch, awkward FIPS constraints), restrictions that contributed to an earlier reliance on tools like Mathematica.
  • Separate note: “1663” is explained as LANL’s science-and-tech magazine, named after its WWII PO box.
  • One comment mentions recent heavy LANL layoffs and a sense of anxiety inside the lab.

Public–Private Partnerships, Capture, and Tech Transfer

  • Central worry: if labs depend on industry for frontier models and compute, public research could be “captured” by commercial agendas—data, methods, and IP effectively controlled by a few firms.
  • Some view this as part of a broader pattern of privatizing state capacity (real estate, missile defense as a subscription service, etc.), leaving government structurally dependent on contractors.
  • Others argue this is exactly what tech transfer is for: public R&D → private commercialization, which historically enabled much of today’s tech stack.
  • Counterpoint: that logic assumes fair, transparent processes; critics doubt that’ll hold when a handful of AI firms control crucial infrastructure.
  • There is sharp distrust of defense contractors and “entrepreneurs” seen as driving cost inflation, fraud, and lock-in.
  • Some call for AI critical to national security to be kept entirely inside DoD, without private IP; others stress that labs do work with a wide range of companies, including startups.

AI vs. Climate, Energy, and National Priorities

  • One camp: fix global warming and build clean energy first; worries that AI hype diverts capital, power, and political attention.
  • Another: you can and must do both; delaying AI means trailing other economies and militaries. Historical analogies (Manhattan Project, Space Race) are invoked for large-scale tech investment.
  • Debate on whether AI meaningfully accelerates climate solutions (optimization, power efficiency, planning) or is mostly a distraction that increases energy demand.
  • Some argue a rapid build-out of renewables, storage, EVs, and heat pumps could halve emissions without AI; others emphasize that electricity is only a fraction of total emissions, with hard tradeoffs elsewhere.
  • Subthread on “degrowth”: some say reduced per-capita energy use is necessary; others insist increasing energy use is core to progress (Kardashev-style thinking).
  • Nationalism and geopolitics recur: concerns that “China will win if we slow AI” are set against arguments that the competitive framing itself worsens both AI risk and climate risk.

What “AI for Science” Means

  • Several comments note the article’s “AI for science” framing isn’t just about LLMs; examples like AlphaFold or geometry-proving systems are cited as more emblematic scientific advances.
  • Some readers suspect that current AI enthusiasm at LANL and DOE is being politically driven (“because they’re being told to”) and worry about administration-level capture by tech interests.
  • Others see real promise if national labs focus on scientific applications—simulation, materials, climate modeling—rather than primarily chasing commercial generative AI.

Benchmarks, Hype, and Real-World Performance

  • Interview claims about AI surpassing humans on almost all benchmarks are widely challenged.
  • Commenters note that tools like Gemini can look impressive in short “play” sessions yet fail unpredictably on simple tasks, hallucinate, or produce plausible nonsense.
  • There is frustration that current benchmarks underweight reliability, non-hallucination, and long-horizon reasoning—areas where humans still excel.
  • Some think LLM progress is already plateauing and that parameter-scaling gives diminishing returns; others argue agent capabilities are clearly improving even if raw knowledge isn’t exploding anymore.
  • A meta-critique emerges: optimists accuse skeptics of “performative cynicism” and moving goalposts; skeptics say claims of inevitable rapid improvement are marketing, not evidence.

Are LLMs “Good Coders”?

  • Strong disagreement about the statement that modern models are “very good coders.”
  • Supportive view:
    • They’re excellent at syntax, boilerplate, pattern recall, and quick prototypes, and occasionally produce insights that would take humans far longer to reach.
    • They can transform workflows for developers who know enough to validate output.
  • Critical view:
    • They lack domain understanding, don’t know requirements, can’t judge ticket quality, and don’t understand cross-team impacts, so they’re autocomplete tools, not programmers.
    • Their failures are unpredictable: they sometimes flounder on trivial tasks and sometimes nail advanced ones.
    • They exhibit a kind of “systematic Dunning–Kruger”: confidently wrong, always producing something instead of admitting ignorance.
  • Many see them as useful assistants when you can rapidly check their work, but not reliable enough to own end-to-end software tasks.
  • There is also skepticism about the claim that models are “very good legal analysts,” which some find especially implausible.

Governance, AI Futures, and General Mood

  • Some prefer commercially driven AI development over state/military-led programs; others argue that once research and infrastructure depend on private platforms, control shifts dangerously fast.
  • A playful but serious tangent speculates about AI CEOs: one commenter extrapolates from current agent time-horizon benchmarks to predict AI-led firms in the 2030s (a back-of-envelope version of that extrapolation is sketched below), while others treat this as premature.
  • The thread ends with a mix of awe and dread: AI feels like it could usher in a renaissance or a breakdown, and many commenters explicitly say it’s unclear which path we’re on.
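
Sidebar: the time-horizon extrapolation, made concrete

The AI-CEO prediction above rests on a compounding argument: if the length of tasks agents can complete keeps doubling on a fixed schedule, you can solve for when it reaches “run a company for a year.” A minimal sketch in Python follows; the starting horizon, doubling period, and target are all illustrative assumptions, not figures from the thread or from any specific benchmark.

    import math

    # All three numbers below are illustrative assumptions, not measured values.
    current_horizon_hours = 8        # tasks an agent can reliably finish today
    doubling_period_months = 7       # how often that horizon is assumed to double
    target_horizon_hours = 2000      # roughly a working year of sustained effort

    # Pure exponential model: doublings needed = log2(target / current),
    # and each doubling takes one doubling period.
    doublings = math.log2(target_horizon_hours / current_horizon_hours)
    months_needed = doublings * doubling_period_months

    print(f"Doublings needed: {doublings:.1f}")
    print(f"Years until the target horizon: {months_needed / 12:.1f}")

Under these assumptions the target arrives in roughly five years; the answer is extremely sensitive to the assumed doubling period, which is why other commenters treat the prediction as premature.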