2024-04-16

AMD unveils Ryzen Pro 8000-series processors

Apple M-Series, Unified Memory, and LLMs

Unified on-package memory is praised for letting Macs load very large LLMs; high-memory Macs are popular in the local-LLM community.
Multiple commenters stress this doesn’t make M-series “magic”: when models fit in GPU VRAM, high-end Nvidia GPUs are still much faster.
Key benefit of M-series: you can get 64–192 GB of GPU-addressable memory in a laptop/desktop, which consumer GPUs can’t match in VRAM.
Some find M1/M3 performance “solid” for interactive LLM work; others call it “abysmal” vs modern Nvidia GPUs and a missed opportunity.

GPUs vs CPUs/APUs for Local LLMs

Nvidia 30/40-series GPUs vastly outperform M-series or APUs on tokens/s when the model fits in VRAM, due to much higher memory bandwidth.
However, many consumer GPUs have too little VRAM for modern 70B+ models, forcing heavy quantization or offloading to slow system RAM.
Several argue that for “just experimenting” or conversational speeds, a mid-range CPU with lots of cheap RAM is adequate.
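The VRAM point above comes down to simple arithmetic, sketched here as a rough illustration. The function names and the 20% allowance for KV cache and runtime buffers are assumptions made for this example, not figures from the discussion.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, memory_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """Rough check: weights plus an assumed overhead must fit in memory_gb.

    overhead_fraction is a hypothetical allowance for KV cache and buffers;
    real overhead depends on context length and the runtime used.
    """
    needed = weight_footprint_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= memory_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_footprint_gb(70, bits)
        print(f"70B at {bits}-bit: ~{size:.0f} GB of weights, "
              f"fits in 24 GB: {fits(70, bits, 24)}, "
              f"fits in 64 GB: {fits(70, bits, 64)}")
```

Under these assumptions a 70B model needs about 35 GB for weights even at 4 bits per weight, which is why a 24 GB card forces offloading while a 64 GB pool of GPU-addressable memory does not.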

Memory Bandwidth, TOPS, and NPU Limits

For LLMs, commenters emphasize memory bandwidth, not TOPS, as the dominant bottleneck.
Apple M2/M3 bandwidth is high but still below top GPUs; DDR5 in typical PCs is far lower, constraining APUs and NPUs.
NPUs with ~16 TOPS are considered insufficient for high-performance LLM inference; demo numbers like ~8 tokens/s for LLaMA 7B are called underwhelming.
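The bandwidth argument can be made concrete with a common rule of thumb: when generating one token at a time, roughly all of the weights are read from memory once per token, so memory bandwidth divided by model size gives an upper bound on tokens/s. The sketch below assumes that simplification. The function names are hypothetical, and the bandwidth figures are illustrative round numbers chosen for the example, not measurements from the thread.

```python
def ddr_bandwidth_gbs(mega_transfers_per_s: float, channels: int = 2,
                      bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    Assumes 64-bit (8-byte) channels, e.g. dual-channel DDR5-5600.
    """
    return mega_transfers_per_s * 1e6 * channels * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once.

    Ignores compute, caches, and batching, so real throughput is lower.
    """
    return bandwidth_gbs / model_gb


if __name__ == "__main__":
    model_gb = 3.5  # a 7B model at 4 bits per weight (assumed quantization)
    platforms = {
        "dual-channel DDR5-5600 (computed)": ddr_bandwidth_gbs(5600),
        "unified memory, assumed 400 GB/s": 400.0,
        "high-end GPU, assumed 1000 GB/s": 1000.0,
    }
    for name, bw in platforms.items():
        print(f"{name}: {bw:.1f} GB/s, at most ~{max_tokens_per_s(bw, model_gb):.0f} tokens/s")
```

Nothing in this estimate depends on TOPS: under these assumptions the ceiling is set by how fast the weights can be streamed, which is the commenters' point about APUs and NPUs sharing ordinary DDR5.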

Usefulness and Maturity of Ryzen AI / NPUs

Several see the Ryzen AI NPU as early, poorly supported silicon:
- Tooling not yet integrated into mainstream frameworks.
- Some laptops ship with NPUs disabled in firmware.
Others note ONNX/VitisAI support exists and stress NPUs are aimed at low-power, always-on tasks (e.g., background removal, video processing), not large LLMs.

Edge / On-Device AI Use Cases

Proposed “killer apps”: local upscaling, offline STT/TTS, better webcam effects, audio cleanup, local document-aware assistants, and game AI.
Skeptics reply that many of these already run on GPUs/CPUs, and desktop demand (vs mobile) is unclear; defenders highlight power savings and accessibility benefits.

Ryzen Pro 8000 Positioning and Platform Details

“Pro” variants add enterprise features (remote management, security, ECC UDIMM support) and target OEM/enterprise “commercial” markets, not DIY retail.
Commenters describe the 8-core APU limit (vs 12–16 core desktop Ryzens) as a power/thermals trade-off; the integrated GPU and monolithic die reduce idle power, which is useful for servers/NAS.
