2026-06-08

Apple Core AI Framework

Core AI vs Core ML vs MLX

Core AI appears to be Apple’s new primary framework for neural networks and transformers, with a new .aimodel format and profiler.
Core ML is now positioned mainly for “classic” ML: decision trees, tabular features, and non-neural models; it remains necessary for older OSes and devices (Core AI requires OS 27+).
MLX is described as a bring-your-own-weights, research/experimentation stack that does not access the Neural Engine (ANE) and isn’t aimed at end‑user deployment.
Some participants find Apple’s segmentation (Core AI, Core ML, MLX, coreai-opt) confusing and want clearer feature‑parity docs.

Performance, ANE, and deployment

Core AI promises efficient use of CPU, GPU, and ANE; some wonder how it compares to current Metal-optimized / llama.cpp setups.
Prior reverse-engineering of private ANE APIs is cited as evidence that Apple still has untapped performance headroom versus existing public frameworks.
ANE has already been usable via Core ML for years; MLX still cannot use it.

On-device foundation models and Apple tools

Strong enthusiasm for Apple’s on-device foundation models and the new fm command-line tool (including a local “Chat Completions” API server).
Developers see appeal in a system-wide, on-device model exposed via OS APIs, plus free “private cloud compute” for smaller apps with server-grade privacy guarantees.
Some worry about limits, monetization, and whether Apple will support fully OpenAPI-compatible endpoints.

Local vs cloud AI and scaling debate

One group claims AI will largely move on-device, with “infinite tokens” on consumer hardware and shrinking need for centralized providers.
Others argue large frontier models still outperform small ones, remain too heavy for common devices, and are currently cheaper to access via cloud.
There is disagreement over whether model scaling is hitting limits or just encountering cost/latency constraints.
Many report that ~30–40B open models (Qwen, Gemma, etc.) already deliver “good enough” results for a lot of coding and agentic tasks on desktops, though not yet at top-tier frontier quality.

Cross‑platform frameworks and ecosystem

On Linux and non‑Apple platforms, no Core-AI‑like standard exists; developers juggle multiple stacks (CUDA, onnxruntime, llama.cpp, vendor NPUs).
Participants expect continued fragmentation, with Apple now adding yet another distinct stack but also enabling features like distributed inference across Macs (e.g., over Thunderbolt).

Related topics