Apple unveils new Mac Studio

Local LLMs and Unified Memory

  • Many see the new Mac Studio (especially M3 Ultra with up to 512 GB unified RAM) as a strong local-LLM box because all memory is high-bandwidth and GPU/Neural Engine–accessible.
  • Others dispute “best in the world,” arguing multi‑GPU PCs (3090/4090/5090) or cloud instances remain better for 70B+ models or serious workloads.
  • First‑hand reports: running ~100 GB models on older Studios yields single‑digit tokens/sec, pushing some back to cloud APIs.
  • Counterexamples: users report >15 tok/s on quantized DeepSeek‑R1 (671B) on M2 Ultra 192 GB, and MoE math suggests ~60 tok/s for R1 Q4_K_M on the new Ultra at 512 GB.
  • Debate over quantization: consensus that it does reduce quality in theory, but often imperceptibly for many use cases, and quantization formats like Q4_K_M can perform surprisingly well.
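
The "does it fit" side of this debate comes down to simple arithmetic, sketched below in Python. The function names, the roughly 4.8 bits-per-weight average for Q4_K_M, the roughly 1.6 bits-per-weight "aggressive" quantization, and the 32 GB reserve for the OS and KV cache are all illustrative assumptions, not figures from the thread. It counts weights only, so real headroom will be smaller.

```python
def weights_footprint_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return total_params_billion * bits_per_weight / 8.0


def fits_in_memory(total_params_billion: float,
                   bits_per_weight: float,
                   memory_gb: float,
                   reserve_gb: float = 32.0) -> bool:
    """True if the weights fit after setting aside `reserve_gb`.

    `reserve_gb` is a hypothetical allowance for the OS, KV cache, etc.
    """
    usable_gb = memory_gb - reserve_gb
    return weights_footprint_gb(total_params_billion, bits_per_weight) <= usable_gb


if __name__ == "__main__":
    # Assumed bits-per-weight values; substitute the real average for your quant.
    for label, bpw in [("~Q4_K_M (assumed 4.8 bpw)", 4.8),
                       ("aggressive quant (assumed 1.6 bpw)", 1.6)]:
        size = weights_footprint_gb(671, bpw)
        for mem in (192, 512):
            verdict = "fits" if fits_in_memory(671, bpw, mem) else "does not fit"
            print(f"671B @ {label}: {size:.1f} GB, {verdict} in {mem} GB")
```

Under these assumptions a 671B model at around 4.8 bits per weight needs the 512 GB configuration, while a 192 GB machine would require a much more aggressive quantization.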

Bandwidth, Compute, and Diminishing Returns

  • Several note RAM capacity isn’t the main bottleneck; memory bandwidth and raw GPU/CPU throughput are.
  • Critique that Apple didn’t increase bandwidth versus the prior Studio, so very large models will see diminishing returns even if they fit in memory.
  • MoE architectures partially mitigate this by only activating a subset of experts per token.
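
The bandwidth argument above can be made concrete with a rough upper bound: during decoding, each generated token has to read every active weight from memory once, so tokens/sec cannot exceed bandwidth divided by the bytes of active weights. The Python sketch below assumes a round 800 GB/s bandwidth, about 4.8 bits per weight, and about 37B active parameters for a DeepSeek-R1-style MoE; all three are illustrative placeholders, and the function name is hypothetical. It ignores compute limits, KV cache reads, and prompt processing, so real throughput will be lower, and it does not attempt to reproduce the tok/s figures quoted in the thread, which may rest on different assumptions.

```python
def decode_tokens_per_sec_upper_bound(active_params_billion: float,
                                      bits_per_weight: float,
                                      bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling on decode speed.

    Assumes every active weight is read from memory once per token
    and that nothing else (compute, KV cache) is a bottleneck.
    """
    gb_read_per_token = active_params_billion * bits_per_weight / 8.0
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    BANDWIDTH_GB_S = 800.0   # assumed placeholder; use the real spec
    BPW = 4.8                # assumed average bits per weight

    # Dense models: all parameters are active for every token.
    for size_b in (8, 70, 400):
        rate = decode_tokens_per_sec_upper_bound(size_b, BPW, BANDWIDTH_GB_S)
        print(f"dense {size_b}B: <= {rate:.1f} tok/s")

    # MoE: only the activated experts are read (assumed ~37B active).
    moe = decode_tokens_per_sec_upper_bound(37, BPW, BANDWIDTH_GB_S)
    print(f"MoE with ~37B active: <= {moe:.1f} tok/s")
```

At fixed bandwidth the ceiling falls in proportion to the active parameter count, which is why extra capacity alone brings diminishing returns for dense models and why MoE models fare better than their total size suggests.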

Use Cases and Target Buyers

  • Suggested buyers: AI developers, people wanting always-on local agents, video/photo editors, render farms, high-end creative studios.
  • Some doubt cost-effectiveness versus traditional servers or GPU farms for rendering and databases; server CPUs already support multi‑TB RAM.
  • A few think Apple is simply pre-positioning for the next wave of bigger LLMs.

Web/Electron Bloat and Accessibility

  • The 512 GB headline sparks jokes about needing that much just to run Chrome/Electron/CRUD apps.
  • Underneath the humor: real complaints about browser RAM pressure affecting tools (e.g., builds auto-scaling threads down).
  • Broader discussion of the trend toward canvas/WASM‑based UIs and concerns that it worsens accessibility; blind and deaf users describe modern web apps as increasingly unusable.
  • Some are actively working on canvas accessibility and hope for JavaScript-level accessibility APIs.

Pricing, Storage, and Upgradability

  • Strong criticism of Apple SSD pricing (order-of-magnitude over commodity NVMe).
  • Common advice: buy minimal internal storage and use external Thunderbolt NVMe; ugly but far cheaper.
  • Prior Studios and Minis use removable proprietary SSD modules; third‑party upgrades exist, but the process is nontrivial and compatibility with the new models is not yet confirmed.
  • RAM pricing (for 512 GB configs) makes fully specced machines ~$14–15k; many see this as out of reach for individuals.

Chip Choices and Naming Confusion

  • Confusion and annoyance that the high-end option is now “M3 Ultra” (older gen, more compute) while the other is “M4 Max” (newer gen, less compute).
  • Some argue Apple’s scheme (generation + Pro/Max/Ultra tiers) is still clearer than Intel/AMD naming; others find “Max” being weaker than “Ultra” counterintuitive.
  • Buyers express frustration at lack of clear, Apple-provided comparisons between M3 Ultra and M4 Max by workload (single-threaded vs. massively parallel, media vs. AI).

Design Details: Power Button and Expansion

  • Minor but recurring gripe: Studio’s rear power button and Mini’s bottom power button are annoying for shared/lab machines that are often shut down.
  • Others respond that Macs are meant to sleep, not power off; energy use at idle is claimed to be negligible, though some still prefer full shutdown for security or habit.
  • Thunderbolt 5 + external PCIe chassis is highlighted as Apple’s answer to “expansion,” but GPU support over Thunderbolt on Apple Silicon remains unclear and niche.