Taking on CUDA with ROCm: 'One Step After Another'

Overall sentiment on ROCm vs CUDA

  • Many see AMD’s software stack as years behind CUDA, citing historic underinvestment and a lack of vision despite early AI signals.
  • Several comment that AMD’s ROCm stack feels like it’s still in a “plan/cleanup” phase while CUDA is a mature, feature‑rich ecosystem.
  • Some think AMD can still succeed in data centers even if ROCm never fully catches up to CUDA, but most argue the software gap is now a critical competitive issue.

Hardware support, lifecycle, and value

  • Major frustration: ROCm’s limited and shifting hardware support, especially for consumer GPUs and APUs.
    • Only recent RDNA generations are officially supported; older and even high‑end RDNA2 cards are often left behind.
    • “Unofficial” workarounds sometimes work, but tend to break with kernel/driver updates.
  • ROCm is seen as “mayfly‑lifetime” compared to CUDA’s long support window for older NVIDIA GPUs.
  • On price/perf, AMD and Intel cards (including new Radeon Pro and Arc Pro) can be very attractive, but support headaches make some users regret not buying NVIDIA.
  • Several users report good local LLM performance on new AMD cards (especially RDNA4), but describe setup as fiddly.
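The “unofficial workarounds” mentioned above usually mean overriding the GPU architecture ROCm detects, so an unsupported card is treated as a supported one. A minimal sketch, with the caveat that the spoofed version depends on the card (10.3.0 is the value commonly reported for RDNA2 parts) and that this is explicitly unsupported:

```shell
# Tell the ROCm runtime to treat the GPU as a gfx1030 (RDNA2) part.
# Unofficial: may work on one ROCm release and break on the next.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# Check whether the runtime now enumerates the GPU as an agent.
rocminfo | grep -i gfx
```

Because the override only papers over the support matrix, kernels compiled for the spoofed architecture can still misbehave on hardware that differs from it.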

Developer experience and ecosystem

  • ROCm is criticized as buggy, hard to install, poorly packaged, and slow to support popular frameworks and features.
  • Compared to CUDA, AMD lacks:
    • Rich, polyglot libraries.
    • First‑class IDE/debugger tooling.
    • Smooth experiences in PyTorch, vLLM, etc. (though things are improving).
  • Vulkan backends often “just work” and can match or beat ROCm in some LLM workloads, but Vulkan is viewed as low‑level, verbose, and ergonomically painful.
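In practice, the “just works” Vulkan path usually means an inference engine’s prebuilt Vulkan backend rather than hand-written Vulkan compute. A sketch using llama.cpp as an example (the `GGML_VULKAN` CMake flag and `llama-cli` binary name match recent releases; older versions used different names, so treat both as assumptions to verify against your checkout):

```shell
# Build llama.cpp with its Vulkan backend instead of ROCm/HIP.
# Requires the Vulkan SDK/headers to be installed.
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Offload all layers to the GPU via Vulkan; model.gguf is a placeholder path.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```

This is the trade-off commenters describe: the engine hides Vulkan’s verbosity from end users, while anyone writing custom kernels still faces it directly.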

Open source, trust, and corporate culture

  • ROCm’s openness is praised (fully open userspace, community projects like TheRock), but undermined by:
    • Narrow official device support.
    • Complex, brittle build systems (especially on musl/Alpine).
  • NVIDIA’s proprietary stack is criticized on transparency and FOSS grounds, but many note the market overwhelmingly prioritizes performance, features, and stability over openness.
  • Internal AMD bureaucracy and conservative open‑source policies are described as serious drags on progress.

Proposed directions and alternatives

  • Calls for:
    • Supporting every new AMD GPU/APU with ROCm at launch.
    • Longer support windows for consumer hardware.
    • First‑class C/C++/Fortran (HIP) alongside Triton/Python, not just AI‑centric paths.
    • Better packaging in mainstream distros.
  • Alternatives discussed:
    • Vulkan; SYCL/oneAPI and OpenVINO on Intel hardware.
    • Rust‑GPU, Triton, higher‑level abstractions.
    • Using LLMs/agents and RL to help port CUDA code, though current models are seen as not yet capable of this at scale.
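Short of LLM-assisted porting, the conventional CUDA-to-HIP path is AMD’s HIPIFY tooling, which rewrites CUDA API calls source-to-source (cudaMalloc → hipMalloc, and so on). A sketch, assuming a single-file CUDA kernel and an installed ROCm toolchain (`vector_add.cu` and the gfx1100 target are illustrative placeholders):

```shell
# Translate CUDA source to HIP; most cuda* calls map 1:1 to hip* calls.
hipify-perl vector_add.cu > vector_add.hip.cpp

# Compile with the HIP compiler; --offload-arch selects the GPU ISA
# (gfx1100 = RDNA3, chosen here purely as an example).
hipcc --offload-arch=gfx1100 vector_add.hip.cpp -o vector_add
```

The mechanical translation is the easy part; the thread’s skepticism is about everything after it, such as library gaps and performance tuning, which no syntactic port addresses.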