El Capitan: New supercomputer is the world's fastest
Purpose and Role of El Capitan
- Built at Lawrence Livermore National Laboratory using AMD MI300A APU-based nodes interconnected by HPE’s Slingshot.
- Officially used to model nuclear weapon performance, aging, and safety, replacing live tests under test-ban regimes.
- Also expected to support other HPC workloads such as fusion research, genomics, and fundamental science simulations.
Nuclear Weapons, Deterrence, and Ethics
- Some commenters are disturbed that leading-edge compute is driven by nuclear weapons work, especially amid geopolitical tensions and stalled disarmament.
- Others argue supercomputer simulations are preferable to live nuclear testing and are essential for stockpile stewardship and credible deterrence.
- Commenters clarified that modern work focuses more on reliability, safety, aging, and variable-yield designs than on ever-higher explosive yields.
- One concern raised: states that neglect stewardship risk discovering the "use-by dates" of their warheads only in a crisis.
Why Nuclear Simulations Need Massive Compute
- Simulations couple many demanding domains: radiation and neutron transport, hydrodynamics, plasma physics, high-temperature chemistry, and aging effects.
- Complexity is driven by extremely short time scales (nanoseconds), extreme conditions (pressures, temperatures, plasmas), and the need for full 3D modeling.
- Codes often run large ensembles for uncertainty quantification and sensitivity analysis; a toy sketch of the ensemble pattern follows this list.
- There is debate over whether these codes simulate down to subatomic particles; the thread's consensus is that full per-particle modeling is infeasible and heavy approximations are required.
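A minimal sketch of that ensemble pattern, assuming a deliberately trivial stand-in model (an exponential energy-release integral with one uncertain rate constant); the model, parameter names, and distributions are illustrative, not taken from any real weapons code.

```python
# Illustrative uncertainty-quantification ensemble: sample uncertain inputs,
# run the model once per sample, then summarize the spread of the output.
# A real ensemble member would be a full 3D multiphysics run costing many
# node-hours; here it is a cheap toy integral so the sketch actually runs.
import numpy as np

rng = np.random.default_rng(seed=0)

def toy_model(rate, duration_ns=100.0, dt_ns=0.01):
    """Toy stand-in for an expensive solver: integrate an exponential
    energy-release curve over ~100 ns at 0.01 ns resolution (10,000 steps)."""
    t = np.arange(0.0, duration_ns, dt_ns)
    return float(np.sum(np.exp(rate * t * 1e-2)) * dt_ns)  # arbitrary units

# Uncertain input: a rate constant known only to ~5% (assumed distribution).
n_members = 1_000
rates = rng.normal(loc=1.0, scale=0.05, size=n_members)

# The ensemble is embarrassingly parallel across members, but each member
# is individually enormous in a real code.
outputs = np.array([toy_model(r) for r in rates])

print(f"mean output: {outputs.mean():.2f}")
print(f"spread     : {outputs.std():.2f} ({100 * outputs.std() / outputs.mean():.1f}% of mean)")
```

Each member is independent, which is why ensembles parallelize well; the per-member cost, driven by the nanosecond time steps and coupled physics listed above, is what demands a machine of El Capitan's scale.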
Hardware, Performance, and Precision
- El Capitan is a significant win for AMD in exascale HPC, contrasting with less successful competing efforts.
- Discussion of FP64 vs lower-precision compute: nuclear/HPC workloads need high precision, unlike LLM training, which tolerates FP16/FP8; a small precision demo follows this list.
- AI training clusters (e.g., tens of thousands of H100s) may now exceed national labs in raw (low-precision) FLOPs, but workloads and metrics are not directly comparable.
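As a generic numerical illustration of that precision point (not taken from any lab code): accumulating many small contributions, as long-running solvers effectively do, falls apart in half precision while double precision stays essentially exact.

```python
# Sum one million increments of 1e-3 in half, single, and double precision.
# Exact answer: 1000.0. FP16 stalls near 4.0, because once the running total
# reaches ~4 the increment is smaller than half a unit-in-the-last-place and
# rounds away entirely; FP32 drifts slightly; FP64 is effectively exact.
import numpy as np

def accumulate(dtype, n=1_000_000, increment=1e-3):
    total = dtype(0)
    inc = dtype(increment)
    for _ in range(n):
        total = total + inc  # arithmetic stays in the chosen precision
    return float(total)

for dt in (np.float16, np.float32, np.float64):
    print(f"{dt.__name__:8s} -> {accumulate(dt):12.4f}   (exact: 1000.0)")
```

This is the gap the thread points to between FP64-heavy HPC throughput and the FP8/FP16 numbers quoted for AI clusters.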
Topology, Secrecy, and Alternatives
- Key differentiators of supercomputers are low-latency, high-bandwidth interconnects and specialized topologies; many scientific codes are tightly coupled rather than "embarrassingly parallel" (see the stencil sketch after this list).
- Distributed volunteer projects (Folding@home, SETI@home) work for loosely coupled problems, but not for many nuclear/HPC simulations.
- Top500 list is seen as incomplete: Chinese labs and major tech companies often withhold benchmark submissions due to sanctions, secrecy, or lack of incentive.
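A rough sketch of what "tightly coupled" means in practice, using a 1D heat-equation stencil as a stand-in for the coupled physics; the domain decomposition and names are illustrative. Each chunk of the grid needs its neighbours' boundary values before every single timestep, so in a real multi-node run those ghost-cell copies become network messages on the interconnect's critical path.

```python
# Domain-decomposed explicit stencil update. Each "rank" owns a chunk of the
# grid plus one ghost cell per side; before every step the ghost cells must be
# refreshed from neighbouring chunks. In a real code those refreshes are MPI
# messages over the interconnect, issued every timestep, which is why latency
# and bandwidth (not just raw FLOPS) bound this class of simulation -- and why
# it cannot be farmed out to loosely connected volunteer machines.
import numpy as np

n_ranks, cells_per_rank, steps, alpha = 4, 64, 500, 0.1

# Chunk layout: [left ghost, interior cells..., right ghost].
chunks = [np.zeros(cells_per_rank + 2) for _ in range(n_ranks)]
chunks[0][1:5] = 1.0  # a hot spot on the first "rank"

for _ in range(steps):
    # --- halo exchange: stand-in for per-step messages between nodes ---
    for r in range(n_ranks):
        chunks[r][0] = chunks[r - 1][-2] if r > 0 else chunks[0][1]
        chunks[r][-1] = chunks[r + 1][1] if r < n_ranks - 1 else chunks[-1][-2]
    # --- local compute: cannot start until the exchange has completed ---
    for r in range(n_ranks):
        u = chunks[r]
        u[1:-1] += alpha * (u[:-2] - 2.0 * u[1:-1] + u[2:])

total = sum(float(c[1:-1].sum()) for c in chunks)
print(f"total heat after {steps} steps ≈ {total:.3f} (conserved across chunks)")
```

Folding@home-style projects work precisely because their work units stay independent for long stretches; a stencil like this would spend nearly all of its time waiting on the network if the "ranks" were volunteers' home machines.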
Historical and Miscellaneous Notes
- Development of the Fast Fourier Transform and of global seismometer networks was partly driven by nuclear test detection, with significant spillover benefits for geophysics.
- Nostalgia for earlier supercomputers and their front-panel lights, plus comparisons of historical FLOPS records with modern consumer devices.