2025-03-05

Apple unveils new Mac Studio

Local LLMs and Unified Memory

Many see the new Mac Studio (especially M3 Ultra with up to 512 GB unified RAM) as a strong local-LLM box because all memory is high-bandwidth and GPU/Neural Engine–accessible.
Others dispute “best in the world,” arguing multi‑GPU PCs (3090/4090/5090) or cloud instances remain better for 70B+ models or serious workloads.
First‑hand reports: running ~100 GB models on older Studios yields single‑digit tokens/sec, pushing some back to cloud APIs.
Counterexamples: users report >15 tok/s on quantized DeepSeek‑R1 (671B) on M2 Ultra 192 GB, and MoE math suggests ~60 tok/s for R1 Q4_K_M on the new Ultra at 512 GB.
Debate over quantization: consensus that it reduces quality in principle, but often imperceptibly for many use cases, and quantization formats like Q4_K_M can perform surprisingly well.
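The capacity side of these claims comes down to simple arithmetic, sketched below. This is a rough estimate, not a measurement: the ~4.8 bits per weight used for Q4_K_M and the 0.8 usable-memory fraction (headroom for the OS and KV cache) are assumptions, and the function names are hypothetical.

```python
def quantized_weights_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


def fits_in_memory(weights_gb: float, memory_gb: float,
                   usable_fraction: float = 0.8) -> bool:
    """True if the weights fit in the assumed usable share of unified memory.

    usable_fraction is an assumption, not a documented limit.
    """
    return weights_gb <= memory_gb * usable_fraction


if __name__ == "__main__":
    # Assumed inputs: 671B total parameters, ~4.8 effective bits per weight.
    size = quantized_weights_gb(671e9, 4.8)
    for memory_gb in (192, 512):
        print(f"{size:.1f} GB of weights on {memory_gb} GB:",
              "fits" if fits_in_memory(size, memory_gb) else "does not fit")
```

Changing `bits_per_weight` shows why heavier quantization is needed to squeeze the same model onto a 192 GB machine.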

Bandwidth, Compute, and Diminishing Returns

Several note RAM capacity isn’t the main bottleneck; memory bandwidth and raw GPU/CPU throughput are.
Critique that Apple didn’t increase memory bandwidth versus the prior Studio, so very large models will see diminishing returns even if they fit in memory.
MoE architectures partially mitigate this by only activating a subset of experts per token.
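The bandwidth argument can be made concrete with a back-of-envelope bound: during decoding, each generated token has to read every active weight from memory once, so bandwidth divided by bytes read per token caps tokens per second. The sketch below assumes an 800 GB/s memory system, ~4.8 bits per weight, and 37B active parameters for a 671B MoE model. These are illustrative inputs, and the bound ignores KV-cache traffic and compute limits, so it will not necessarily match the figures quoted in the thread.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, active_params: float,
                          bits_per_weight: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the only limit."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    bandwidth = 800.0  # GB/s, assumed
    bits = 4.8         # effective bits per weight, assumed

    # A dense model reads all of its weights for every token.
    dense = decode_tokens_per_sec(bandwidth, 70e9, bits)
    # An MoE model reads only the experts activated for that token.
    moe = decode_tokens_per_sec(bandwidth, 37e9, bits)

    print(f"dense 70B bound: {dense:.1f} tok/s")
    print(f"MoE 37B-active bound: {moe:.1f} tok/s")
```

Under this model, adding RAM without adding bandwidth lets a larger model fit but does not raise the ceiling, which is the diminishing-returns point made above.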

Use Cases and Target Buyers

Suggested buyers: AI developers, people wanting always-on local agents, video/photo editors, render farms, high-end creative studios.
Some doubt cost-effectiveness versus traditional servers or GPU farms for rendering and databases; server CPUs already support multi‑TB RAM.
A few think Apple is simply pre-positioning for the next wave of bigger LLMs.

Web/Electron Bloat and Accessibility

The 512 GB headline sparks jokes about needing that much just to run Chrome/Electron/CRUD apps.
Underneath the humor: real complaints about browser RAM pressure affecting tools (e.g., builds auto-scaling threads down).
Broader discussion of the trend toward canvas/WASM‑based UIs and concerns that it worsens accessibility; blind and deaf users describe modern web apps as increasingly unusable.
Some are actively working on canvas accessibility and hope for JavaScript-level accessibility APIs.

Pricing, Storage, and Upgradability

Strong criticism of Apple SSD pricing (order-of-magnitude over commodity NVMe).
Common advice: buy minimal internal storage and use external Thunderbolt NVMe; ugly but far cheaper.
Prior Studios and Minis use removable proprietary SSD modules; third‑party upgrades exist, but the process is nontrivial and compatibility with the new models is not yet confirmed.
RAM pricing (for 512 GB configs) makes fully specced machines ~$14–15k; many see this as out of reach for individuals.

Chip Choices and Naming Confusion

Confusion and annoyance that the high-end option is now “M3 Ultra” (older gen, more compute) while the other is “M4 Max” (newer gen, less compute).
Some argue Apple’s scheme (generation + Pro/Max/Ultra tiers) is still clearer than Intel/AMD naming; others find “Max” being weaker than “Ultra” counterintuitive.
Buyers express frustration at lack of clear, Apple-provided comparisons between M3 Ultra and M4 Max by workload (single-threaded vs. massively parallel, media vs. AI).

Design Details: Power Button and Expansion

Minor but recurring gripe: Studio’s rear power button and Mini’s bottom power button are annoying for shared/lab machines that are often shut down.
Others respond that Macs are meant to sleep, not power off; energy use at idle is claimed to be negligible, though some still prefer full shutdown for security or habit.
Thunderbolt 5 + external PCIe chassis is highlighted as Apple’s answer to “expansion,” but GPU support over Thunderbolt on Apple Silicon remains unclear and niche.
