Apple Core AI Framework

Core AI vs Core ML vs MLX

  • Core AI appears to be Apple’s new primary framework for neural networks and transformers, with a new .aimodel format and profiler.
  • Core ML is now positioned mainly for “classic” ML: decision trees, tabular features, and non-neural models; it remains necessary for older OSes and devices (Core AI requires OS 27+).
  • MLX is described as a bring-your-own-weights, research/experimentation stack that does not access the Neural Engine (ANE) and isn’t aimed at end‑user deployment.
  • Some participants find Apple’s segmentation (Core AI, Core ML, MLX, coreai-opt) confusing and want clearer feature‑parity docs.

Performance, ANE, and deployment

  • Core AI promises efficient use of CPU, GPU, and ANE; some wonder how it compares to current Metal-optimized / llama.cpp setups.
  • Prior reverse-engineering of private ANE APIs is cited as evidence that Apple still has untapped performance headroom versus existing public frameworks.
  • ANE has already been usable via Core ML for years; MLX still cannot use it.

On-device foundation models and Apple tools

  • Strong enthusiasm for Apple’s on-device foundation models and the new fm command-line tool (including a local “Chat Completions” API server).
  • Developers see appeal in a system-wide, on-device model exposed via OS APIs, plus free “private cloud compute” for smaller apps with server-grade privacy guarantees.
  • Some worry about limits, monetization, and whether Apple will support fully OpenAPI-compatible endpoints.

Local vs cloud AI and scaling debate

  • One group claims AI will largely move on-device, with “infinite tokens” on consumer hardware and shrinking need for centralized providers.
  • Others argue large frontier models still outperform small ones, remain too heavy for common devices, and are currently cheaper to access via cloud.
  • There is disagreement over whether model scaling is hitting limits or just encountering cost/latency constraints.
  • Many report that ~30–40B open models (Qwen, Gemma, etc.) already deliver “good enough” results for a lot of coding and agentic tasks on desktops, though not yet at top-tier frontier quality.

Cross‑platform frameworks and ecosystem

  • On Linux and non‑Apple platforms, no Core-AI‑like standard exists; developers juggle multiple stacks (CUDA, onnxruntime, llama.cpp, vendor NPUs).
  • Participants expect continued fragmentation, with Apple now adding yet another distinct stack but also enabling features like distributed inference across Macs (e.g., over Thunderbolt).