Launch HN: RunAnywhere (YC W26) – Faster AI Inference on Apple Silicon
What RunAnywhere / RCLI Provides
- Company builds MetalRT, a proprietary inference engine for Apple Silicon, plus RCLI, an open-source CLI demo.
- RCLI wires together local speech-to-text (STT), LLM, text-to-speech (TTS), local RAG, and macOS actions into a voice assistant / TUI.
- Emphasis on fully local processing and no telemetry by default.
Performance & Model Choices
- The company's published benchmarks claim MetalRT is modestly faster than competing Apple-Silicon engines (e.g., MLX, uzu) on 0.6B–4B models and much faster for STT/TTS.
- Some commenters call those small models “toy-sized” and ask for benchmarks on 7B–70B+ models; the founders say larger models are on the roadmap.
- Commenters note unified memory makes Apple Silicon attractive for very large models; current MetalRT support is focused on latency-sensitive voice pipelines.
Use Cases & Feature Requests
- Suggested uses: always-on dictation, virtual audio devices for real-time transcription in video calls, and on-device RAG over sensitive documents.
- RCLI supports local RAG with fast hybrid retrieval, plus a text-only mode (no TTS).
- Requests include better quantization formats (e.g., unsloth), richer TTS voices, diarization, Linux support, and SDK access for third-party apps.
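The “hybrid retrieval” mentioned above typically blends a lexical score with a vector-similarity score. A minimal sketch of that idea, assuming precomputed embeddings (all function names are hypothetical illustrations, not RCLI's actual implementation):

```python
import math


def keyword_score(query: str, doc: str) -> float:
    """Lexical component: fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)


def cosine(a: list[float], b: list[float]) -> float:
    """Vector component: cosine similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_search(query: str, query_vec: list[float],
                  corpus: list[tuple[str, list[float]]],
                  alpha: float = 0.5) -> list[str]:
    """Rank documents by a weighted blend of lexical and vector scores.

    corpus is a list of (text, embedding) pairs; alpha weights the
    lexical score against the embedding score.
    """
    scored = []
    for text, vec in corpus:
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)]
```

Real engines use BM25 and ANN indexes rather than this linear scan, but the blending step is the same shape.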
Quality & Limitations
- Tool-calling with small models is unreliable: a command may be verbally “acknowledged” without the correct macOS action ever firing.
- Team acknowledges this as a core unsolved problem for sub-4B on-device models and plans verification layers and larger models.
- Default TTS quality is criticized as dated; better models (e.g., Kokoro) are available but not default.
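The verification layer the team plans could take roughly this shape: execute the requested action, then check an observable postcondition before letting the assistant confirm success. A sketch under those assumptions (names and structure are illustrative, not the team's design):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    name: str
    action: Callable[[], None]   # performs the macOS action (e.g., via AppleScript)
    verify: Callable[[], bool]   # checks an observable postcondition afterwards


def run_with_verification(call: ToolCall) -> str:
    """Report success only if the postcondition actually holds.

    This prevents the failure mode where a small model 'acknowledges'
    a command that never took effect.
    """
    try:
        call.action()
    except Exception as exc:
        return f"{call.name} failed: {exc}"
    if call.verify():
        return f"{call.name} succeeded"
    # The action ran without raising, but the world did not change as expected.
    return f"{call.name} reported done, but verification failed"
```

The key design point is that the spoken confirmation is gated on `verify()`, not on the model's own claim.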
Installation & Platform Support
- Some users report segfaults and Homebrew install issues.
- The install script silently installing Homebrew is widely criticized; the maintainers agree to change it.
- MetalRT currently targets M3/M4; M1/M2 fall back to llama.cpp. Mobile support and other edge devices are planned.
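The M3/M4-versus-older dispatch can be sketched as a check on the chip brand string that macOS reports via `sysctl -n machdep.cpu.brand_string`. The selection logic below is an illustration, not RunAnywhere's code:

```python
import re
import subprocess


def detect_chip() -> str:
    """Return the CPU brand string, e.g. 'Apple M3 Pro' on Apple Silicon Macs."""
    out = subprocess.run(
        ["sysctl", "-n", "machdep.cpu.brand_string"],
        capture_output=True, text=True,
    )
    return out.stdout.strip()


def pick_engine(brand: str) -> str:
    """Choose MetalRT on M3/M4; fall back to llama.cpp on M1/M2 or non-Apple CPUs."""
    match = re.search(r"Apple M(\d+)", brand)
    if match and int(match.group(1)) >= 3:
        return "metalrt"
    return "llama.cpp"
```

Keeping `pick_engine` a pure function of the brand string makes the fallback policy trivially testable off-device.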
Licensing & Openness
- RCLI is MIT-licensed; MetalRT and many models are proprietary.
- Some see a closed inference engine as “reinventing the wheel” versus CoreML/MLX; maintainers argue specialization yields higher performance and unified STT/LLM/TTS support.
Security & Trust Concerns
- A web demo leaked third-party API keys; the founders' initial “bait” response is criticized as flippant, then walked back with an apology and a promise to fix the leak.
- Separate prior controversy: company scraped GitHub data and used lookalike domains for cold email campaigns. Several commenters say this permanently damaged their trust.
HN Voting & Moderation Discussion
- Some users suspect vote manipulation due to fast rise and comment ordering.
- HN moderation explains: YC “Launch HN” posts receive special placement, and off-topic comments (e.g., about past behavior rather than the product) are downweighted but not removed.