Apple unveils new Mac Studio
Local LLMs and Unified Memory
- Many see the new Mac Studio (especially M3 Ultra with up to 512 GB unified RAM) as a strong local-LLM box because all memory is high-bandwidth and GPU/Neural Engine–accessible.
- Others dispute “best in the world,” arguing multi‑GPU PCs (3090/4090/5090) or cloud instances remain better for 70B+ models or serious workloads.
- First‑hand reports: running ~100 GB models on older Studios yields single‑digit tokens/sec, pushing some back to cloud APIs.
- Counterexamples: users report >15 tok/s on quantized DeepSeek‑R1 (671B) on M2 Ultra 192 GB, and MoE math suggests ~60 tok/s for R1 Q4_K_M on the new Ultra at 512 GB.
- Debate over quantization: the consensus is that it does reduce quality in theory, but often imperceptibly for many use cases, and quantization formats such as Q4_K_M can perform surprisingly well.
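The "does it fit" side of these reports comes down to simple arithmetic: weight storage is roughly parameter count times bits per weight. A minimal sketch, assuming an average of about 4.5 bits per weight for a 4-bit-class quantization (an illustrative figure, not one taken from the thread) and ignoring KV cache and runtime overhead; the function names and the `headroom_gb` parameter are hypothetical:

```python
def quantized_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return total_params * bits_per_weight / 8 / 1e9


def fits(total_params: float, bits_per_weight: float,
         memory_gb: float, headroom_gb: float = 0.0) -> bool:
    """True if the weights plus a chosen headroom fit in memory_gb."""
    return quantized_size_gb(total_params, bits_per_weight) + headroom_gb <= memory_gb


R1_TOTAL_PARAMS = 671e9  # parameter count quoted in the thread
ASSUMED_BPW = 4.5        # assumption: average bits per weight, illustrative only

size_gb = quantized_size_gb(R1_TOTAL_PARAMS, ASSUMED_BPW)
print(f"approx. weights: {size_gb:.1f} GB")
print("fits in 512 GB:", fits(R1_TOTAL_PARAMS, ASSUMED_BPW, 512))
print("fits in 192 GB:", fits(R1_TOTAL_PARAMS, ASSUMED_BPW, 192))
```

Under this assumption a 4-bit-class build of a 671B model would not fit in 192 GB, so the M2 Ultra reports presumably relied on a more aggressive quantization; the thread summary does not say which.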
Bandwidth, Compute, and Diminishing Returns
- Several note RAM capacity isn’t the main bottleneck; memory bandwidth and raw GPU/CPU throughput are.
- Critics note that Apple did not raise memory bandwidth relative to the prior Studio, so very large models will see diminishing returns even if they fit in memory.
- MoE architectures partially mitigate this by only activating a subset of experts per token.
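The bandwidth argument can be made concrete with a back-of-envelope bound: during decoding, each generated token requires reading every active weight once, so tokens per second cannot exceed memory bandwidth divided by the bytes of active weights. A rough sketch of that reasoning; the bandwidth, bits-per-weight, and active-parameter values below are placeholder assumptions rather than figures from the thread, and the function name is hypothetical:

```python
def decode_tokens_per_sec_upper_bound(bandwidth_gb_s: float,
                                      active_params: float,
                                      bits_per_weight: float) -> float:
    """Bandwidth-only ceiling on decode speed.

    Assumes every active weight is read once per token and nothing else
    (compute, KV-cache reads, prompt processing) costs any time.
    """
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


ASSUMED_BANDWIDTH_GB_S = 800.0  # assumption: Ultra-class unified memory bandwidth
ASSUMED_BPW = 4.5               # assumption: average bits per weight
TOTAL_PARAMS = 671e9            # parameter count quoted in the thread
ASSUMED_ACTIVE_PARAMS = 37e9    # assumption: parameters activated per token (MoE)

# Dense case: all weights are read for every token.
dense_bound = decode_tokens_per_sec_upper_bound(
    ASSUMED_BANDWIDTH_GB_S, TOTAL_PARAMS, ASSUMED_BPW)

# MoE case: only the active experts' weights are read for every token.
moe_bound = decode_tokens_per_sec_upper_bound(
    ASSUMED_BANDWIDTH_GB_S, ASSUMED_ACTIVE_PARAMS, ASSUMED_BPW)

print(f"dense ceiling: {dense_bound:.1f} tok/s")
print(f"MoE ceiling:   {moe_bound:.1f} tok/s")
```

The ceiling scales linearly with bandwidth and inversely with active bytes per token, which is why a MoE model that fits in memory can decode far faster than a dense model of the same total size. The result is sensitive to the assumed values, and real throughput sits below the ceiling.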
Use Cases and Target Buyers
- Suggested buyers: AI developers, people wanting always-on local agents, video/photo editors, render farms, high-end creative studios.
- Some doubt cost-effectiveness versus traditional servers or GPU farms for rendering and databases; server CPUs already support multi‑TB RAM.
- A few think Apple is simply pre-positioning for the next wave of bigger LLMs.
Web/Electron Bloat and Accessibility
- The 512 GB headline sparks jokes about needing that much just to run Chrome/Electron/CRUD apps.
- Underneath the humor are real complaints about browser RAM pressure affecting other tools (e.g., build systems automatically reducing their thread count).
- A broader discussion covers the trend toward canvas/WASM‑based UIs and concerns that it worsens accessibility; blind and deaf users describe modern web apps as increasingly unusable.
- Some are actively working on canvas accessibility and hope for JavaScript-level accessibility APIs.
Pricing, Storage, and Upgradability
- Strong criticism of Apple SSD pricing (roughly an order of magnitude above commodity NVMe).
- Common advice: buy minimal internal storage and use external Thunderbolt NVMe; ugly but far cheaper.
- Prior Studios and Minis use removable proprietary SSD modules; third‑party upgrades exist, but the process is nontrivial and not yet confirmed for the new models.
- RAM pricing (for 512 GB configs) makes fully specced machines ~$14–15k; many see this as out of reach for individuals.
Chip Choices and Naming Confusion
- Confusion and annoyance that the high-end option is now “M3 Ultra” (older gen, more compute) while the other is “M4 Max” (newer gen, less compute).
- Some argue Apple’s scheme (generation + Pro/Max/Ultra tiers) is still clearer than Intel/AMD naming; others find “Max” being weaker than “Ultra” counterintuitive.
- Buyers express frustration at lack of clear, Apple-provided comparisons between M3 Ultra and M4 Max by workload (single-threaded vs. massively parallel, media vs. AI).
Design Details: Power Button and Expansion
- Minor but recurring gripe: Studio’s rear power button and Mini’s bottom power button are annoying for shared/lab machines that are often shut down.
- Others respond that Macs are meant to sleep, not power off; energy use at idle is claimed to be negligible, though some still prefer full shutdown for security or habit.
- Thunderbolt 5 plus an external PCIe chassis is highlighted as Apple’s answer to “expansion,” but GPU support over Thunderbolt on Apple Silicon remains unclear and niche.