Handy – Free open source speech-to-text app
UI, accessibility, and CLI vs GUI
- Some questioned why a GUI is needed; responses stressed accessibility for non-technical users and ease of installation (especially on macOS/Linux).
- A separate CLI version exists and is used for automation / shell workflows.
- Users praise the minimal, “obvious” UI and history view; one finds another app’s UI “too complicated” by comparison.
Models, speed, and local processing
- Parakeet V3 is repeatedly praised as “incredibly fast” and highly accurate, often beating built‑in macOS dictation and other tools.
- Handy runs fully locally, leveraging GPU where available; users value this both for privacy and avoiding ongoing costs.
- “Discharging the model” simply unloads it from RAM, freeing memory at the cost of a slower cold start the next time it is needed.
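
The unload trade-off described above can be sketched as an idle-timeout wrapper. This is a minimal illustration, not Handy's actual implementation: the `LazyModel` class, its `loader` callback, and the `idle_seconds` threshold are all hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Optional


class LazyModel:
    """Keeps a model in memory only while it is in use (hypothetical helper)."""

    def __init__(
        self,
        loader: Callable[[], Any],
        idle_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader            # assumed: an expensive function that loads the model
        self._idle_seconds = idle_seconds
        self._clock = clock              # injectable so the behaviour can be tested
        self._model: Optional[Any] = None
        self._last_used = 0.0
        self.load_count = 0              # counts cold starts

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        """Return the model, paying the cold-start cost if it was unloaded."""
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self) -> bool:
        """Drop the model once it has been idle long enough; True if unloaded."""
        idle_for = self._clock() - self._last_used
        if self._model is not None and idle_for >= self._idle_seconds:
            self._model = None           # memory is reclaimed once no references remain
            return True
        return False
```

A short timeout keeps memory free between dictations but makes every session start cold; a long timeout does the opposite.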
Features, post‑processing, and limitations
- Desired features: custom dictionary / replacements for domain terms, confidence indicators on words, ability to edit or correct already typed text, direct piping to tools like Claude Code, meeting transcription, API access, iOS/mobile apps, and an option to keep no history (currently in a debug menu).
- Handy supports custom words, a built‑in dictionary, and experimental LLM post‑processing (hidden in a debug menu).
- Bluetooth mics (e.g., AirPods) introduce a 1–2 s lag at the start of recording; internal laptop mics work better. This latency is a common complaint.
- There’s a hotkey pitfall: default Ctrl+Space can emit control characters if key‑up timing is unlucky (e.g., in Emacs).
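
The replacement feature users asked for, for domain terms, can be sketched as a post-processing pass over the transcript. This is a rough illustration under stated assumptions, not Handy's code: `apply_replacements` and the example mapping are hypothetical, and matching is done case-insensitively on whole words.

```python
import re
from typing import Mapping


def apply_replacements(text: str, replacements: Mapping[str, str]) -> str:
    """Replace misheard phrases with the intended domain terms (hypothetical helper)."""
    if not replacements:
        return text
    # Try longer phrases first so multi-word entries win over their sub-words.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in replacements.items()}
    # Fall back to the matched text if case folding yields an unknown key.
    return pattern.sub(lambda m: lookup.get(m.group(0).lower(), m.group(0)), text)


# Example mapping (illustrative only): what the recognizer heard -> what was meant.
fixes = {"cloud code": "Claude Code", "pie torch": "PyTorch"}
print(apply_replacements("ask cloud code to fix the pie torch bug", fixes))
```

Whole-word matching keeps a short entry such as "pie" from rewriting the inside of a longer word like "pieces".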
Use cases and impact on workflows
- Users employ Handy for talking to coding agents/LLMs, writing Word comments and feedback, general dictation, and as an accessibility aid (e.g., for dystonia) in place of Superwhisper/MacWhisper.
- Some find speech faster than typing, especially when multitasking; others say they think/type faster and struggle to dictate fluently.
- Discussion extends to “next‑level” workflows: feeding STT output into LLM agents that execute commands, manipulate GUIs, or support “coding by voice.” Commenters reference prior and ongoing work, plus a related tool that records multimodal context for agents.
Comparisons and ecosystem
- Handy is compared with Superwhisper, Wispr Flow, open‑whispr, WhisperTux, MacWhisper, FluidVoice, Hex, VoiceInk, and several mobile apps (Spokenly, Futo keyboard, Android Parakeet apps).
- Many report Handy as at least competitive in accuracy and speed, with the main differentiators being UI, pricing (Handy is free and open source), and real‑time vs batch transcription.
- macOS Dictation is widely described as unreliable for accents, noisy environments, and technical terms.