Ask HN: Why hasn’t AMD made a viable CUDA alternative?
Perceived Root Causes
- Many argue this is primarily a management/strategy failure: AMD has not treated software as first‑class, nor made GPU compute a top priority compared to CPUs and consoles.
- Others stress history and timing: near‑bankruptcy around 2015, a focus on winning the console and CPU battles, and a failed bet on OpenCL left AMD under‑resourced and late.
Software Stack: ROCm, HIP, OpenCL
- AMD does have a CUDA‑like language (HIP) and the ROCm stack, plus some emerging libraries (see the HIP sketch after this list), but:
- Early ROCm was seen as awful to use; that reputation stuck.
  - Support is fragmented: only a few GPU SKUs are “official”; others “almost work” with hacks (e.g., spoofing a supported architecture via the HSA_OVERRIDE_GFX_VERSION environment variable).
- Documentation, tooling, and stability are widely criticized.
- OpenCL was supposed to be the open standard, but lost to CUDA due to worse ergonomics, weaker documentation/community, and poor vendor follow‑through (including AMD’s).
- Some say AMD over-relied on “open source will fix it” instead of funding a first‑class developer experience.
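To make the “CUDA‑like” claim concrete, here is a minimal, illustrative HIP sketch (not from the thread; compiled with hipcc). The point is that both the kernel and the host API are nearly line‑for‑line CUDA with “cuda” renamed to “hip”:

```cpp
// Minimal HIP vector-add sketch: the kernel body, thread-index builtins,
// and even the <<<grid, block>>> launch syntax match CUDA.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void vadd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // same builtins as CUDA
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);
    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));   // cf. cudaMalloc
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);
    vadd<<<(n + 255) / 256, 256>>>(da, db, dc, n);  // CUDA launch syntax works
    hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
    printf("c[0] = %f\n", hc[0]);                   // expect 3.0
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```

The language itself is clearly not the gap; the thread’s complaints center on everything around it (supported hardware, docs, tooling, stability).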
Hardware, Drivers, and Platform Issues
- Reports of buggy, bloated drivers and painful setup (e.g., for llama.cpp) contrast with Nvidia’s “it just works” experience.
- PCIe atomics and motherboard firmware incompatibilities create nondeterministic ROCm failures; users can’t easily know if their board truly supports what ROCm needs.
- Others note architectural/firmware differences: Nvidia offloads more to updatable on‑card firmware, making long‑term support easier.
CUDA’s Ecosystem and Network Effects
- CUDA is described as an ecosystem: mature libraries (cuDNN, cuBLAS, NCCL, etc.), tools, examples, and extensive outreach (on‑site engineers, hackathons).
- Its “moat” is seen less as the core language and more as completeness, stability, and continuity across generations.
- Counterpoint: many ML users write little or no CUDA, relying on PyTorch/TensorFlow. If those frameworks run well on AMD, CUDA’s lock‑in weakens.
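A sketch of that counterpoint: PyTorch’s ROCm builds expose the same `torch.cuda` / `torch::cuda` API surface, so device‑agnostic code runs unchanged on AMD. This illustrative libtorch snippet (mine, not from the thread) contains nothing vendor‑specific:

```cpp
// Device-agnostic libtorch sketch. On a ROCm build of PyTorch,
// torch::cuda::is_available() is true on an AMD GPU and torch::kCUDA
// dispatches to the ROCm backend, so the same code runs on either vendor.
#include <torch/torch.h>
#include <iostream>

int main() {
    torch::Device device(torch::cuda::is_available() ? torch::kCUDA : torch::kCPU);
    auto a = torch::randn({1024, 1024}, device);
    auto b = torch::randn({1024, 1024}, device);
    auto c = torch::mm(a, b);  // routed to the vendor BLAS under the hood
    std::cout << "ran matmul on: " << device << "\n";
    return 0;
}
```

If code like this is all most users ever write, the lock‑in lives in how well the framework’s AMD backend works, not in CUDA the language.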
Leadership, Risk, and Investment Constraints
- Debate over leadership style: AMD leadership is portrayed as more conservative, incremental, and beholden to a board versus Nvidia’s founder‑CEO willing to make huge, long‑horizon bets.
- Several comments argue AMD simply hasn’t spent the billions and recruited the thousands of top engineers needed; efforts are “tens of millions” instead of “billions.”
Market Dynamics and Economics
- AMD’s core wins have been in CPUs and consoles; the third market, GPU compute, only exploded recently.
- Some argue it hasn’t been economically rational for AMD to chase Nvidia into a segment where Nvidia enjoys ~80% margins and near-total mindshare.
- Others counter that those margins show there is ample room for a strong challenger, and that real competition would dramatically lower AI compute costs.
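A back‑of‑the‑envelope version of that counterargument (the ~80% gross margin comes from the thread; the challenger’s 40% margin and comparable unit cost are assumptions):

```latex
% At price p and an 80% gross margin, unit cost is
%   c = (1 - 0.8)\,p = 0.2\,p.
% A challenger content with a 40% gross margin on similar cost could price at
%   p' = \frac{c}{1 - 0.4} = \frac{0.2\,p}{0.6} \approx 0.33\,p,
% i.e., roughly a third of the incumbent's price while staying profitable.
```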
Current Efforts and Glimmers of Hope
- ROCm/HIP have improved; people report working setups on recent APUs/GPUs and growing PyTorch ROCm support.
- Third‑party projects like ZLUDA and Scale aim to run CUDA binaries/code on AMD via HIP/ROCm; the near one‑to‑one API mapping sketched after this list is what makes such translation tractable.
- Tinygrad and related community work are seen by some as a promising “beyond CUDA” path, though others are skeptical of their maturity and impact.
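A rough sketch of the host‑API correspondence these translation efforts rely on (illustrative; the CUDA original of each call is shown in the comment, and the port is essentially a rename):

```cpp
// The one-to-one host-API mapping that hipify-style source translation
// automates: each HIP call below replaces the CUDA call in its comment.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    hipGetDeviceCount(&count);         // was: cudaGetDeviceCount
    hipDeviceProp_t prop;
    hipGetDeviceProperties(&prop, 0);  // was: cudaGetDeviceProperties
    printf("device 0: %s\n", prop.name);

    float* buf = nullptr;
    hipMalloc(&buf, 4096);             // was: cudaMalloc
    hipMemset(buf, 0, 4096);           // was: cudaMemset
    hipDeviceSynchronize();            // was: cudaDeviceSynchronize
    hipFree(buf);                      // was: cudaFree
    return 0;
}
```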
Proposed Paths Forward
- Commonly suggested moves for AMD:
- Treat software as co‑equal with hardware; build a strong, empowered software org.
- Provide full ROCm support across all modern GPUs and certify motherboards (“ROCm compatible”).
- Open drivers more fully and collaborate closely with flagship open‑source projects (PyTorch, llama.cpp, etc.).
- Differentiate on hardware with much larger, affordable VRAM pools to attract AI users despite weaker software.