The Prompt API
Model size, storage, and download behavior
- The Prompt API requires a large on-device model; the docs demand "at least 22 GB" of free disk space, which many see as excessive for a browser feature.
- Actual model folders reported around 3–4 GB, with speculation that 22 GB is a safety threshold to allow multiple versions and avoid filling disks.
- Models are lazily downloaded on first use, cached once per browser, and shared across sites.
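The lazy-download behavior above can be sketched against the `LanguageModel` interface from the Prompt API explainer (names and event shapes may change, and the object is injected here so the helper runs outside a browser):

```typescript
// Sketch: check whether the on-device model is present, and create a
// session, surfacing download progress for the one-time, per-browser
// download. The interface mirrors the Prompt API explainer; exact names
// are not guaranteed to be stable.
interface LanguageModelLike {
  availability(): Promise<"unavailable" | "downloadable" | "downloading" | "available">;
  create(opts?: { monitor?: (m: EventTarget) => void }): Promise<unknown>;
}

async function ensureSession(lm: LanguageModelLike): Promise<unknown | null> {
  const status = await lm.availability();
  if (status === "unavailable") return null; // hardware/browser cannot run the model
  // "downloadable" / "downloading": create() triggers or joins the lazy
  // download, which is then cached once and shared across sites.
  return lm.create({
    monitor(m) {
      m.addEventListener("downloadprogress", (e: any) => {
        console.log(`model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```

Injecting the `LanguageModel`-like object also makes the flow testable without the browser-global API.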
User experience and performance
- Several comments describe slow token generation, heating devices, and long initial download, especially on “baseline” hardware.
- Some would rather pay for fast hosted models than run a sluggish local one; others see local as “good enough” for light tasks like search or summarization.
- There’s concern that low-end models are only useful for trivial or very short interactions.
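One common mitigation for slow local generation is to stream tokens into the UI as they arrive rather than waiting for the full response. A minimal sketch, with the chunk source abstracted as an async iterable (the explainer's `promptStreaming()` returns a `ReadableStream` of text chunks, which can be consumed the same way):

```typescript
// Consume a token stream incrementally so a sluggish on-device model
// still feels responsive: the UI is updated per chunk, not at the end.
async function renderStream(
  chunks: AsyncIterable<string>,
  onUpdate: (textSoFar: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of chunks) {
    text += chunk;   // chunks arrive as they are generated
    onUpdate(text);  // e.g., element.textContent = textSoFar
  }
  return text;
}
```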
Privacy, surveillance, and abuse risks
- Some view on-device inference as privacy-preserving; others distrust Chrome/Google and fear background analysis of user data.
- Speculation about covert analytics or wiretap-adjacent uses, though others note this API isn’t required for such behavior.
- Worries about using visitors’ machines for spam or distributed computation; countered by arguments that tiny models and low payoff limit abuse.
Use cases and experiments
- Reported uses include: local search, summarizing hack-day writeups, AI subject-line generation, text adventure modification, AI-based email triage, and potential ad/cookie blockers.
- A large subthread explores “de-snarkifying” social media and comment sections: filtering aggression, summarizing long threads, and stripping clickbait.
- Some welcome this as removing “junk calories”; others fear homogenized “slop” and further detachment from unfiltered reality.
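The "de-snarkify" idea boils down to wrapping each comment in a rewrite instruction before sending it to the model. A hypothetical prompt builder (the wording below is illustrative, not taken from the discussion):

```typescript
// Hypothetical prompt for the de-snarkify use case: ask the model to
// keep the factual content of a comment while stripping aggression.
function deSnarkPrompt(comment: string): string {
  return [
    "Rewrite the following comment so it keeps every factual claim",
    "but removes sarcasm, insults, and clickbait phrasing.",
    "Comment:",
    comment,
  ].join("\n");
}
```

The resulting string would be passed to a session's `prompt()` call; whether a small on-device model rewrites faithfully or produces homogenized "slop" is exactly the disagreement in the thread.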
Standardization, browser ecosystem, and fragmentation
- The Prompt API is currently tied to a specific model per browser (e.g., Gemini Nano in Chrome; other browsers would ship different models).
- Developers worry that prompts are highly model-specific and that the API offers no introspection for adapting behavior per browser, making cross-browser testing harder than with APIs like WebGL.
- Linked responses from other browser vendors show mixed reactions: some detailed, some dismissive.
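Since there is no introspection of which model sits behind the API, the best a page can do today is detect the API's presence and fall back. A sketch under that assumption (the hosted `fallback` function and the minimal `create()/prompt()` shape are illustrative):

```typescript
// Feature-detection sketch: unlike WebGL, the Prompt API exposes no way
// to ask which model will answer, so detection is presence-only.
type PromptFn = (text: string) => Promise<string>;

async function pickBackend(
  globalObj: Record<string, unknown>,   // e.g., globalThis in a browser
  fallback: PromptFn,                   // hypothetical hosted-model client
): Promise<PromptFn> {
  const lm = globalObj["LanguageModel"] as
    | { create(): Promise<{ prompt: PromptFn }> }
    | undefined;
  if (!lm) return fallback; // this browser does not ship the API
  const session = await lm.create();
  // Caveat from the thread: prompts tuned for one built-in model
  // (e.g., Gemini Nano) may behave differently behind another browser's
  // model, and nothing here can detect that.
  return (text) => session.prompt(text);
}
```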
Local vs cloud models and model quality
- Comparisons claim hosted models (e.g., Gemma via APIs) are faster and more capable than in-browser Gemini Nano.
- Some expect browsers/OSes to eventually ship multiple or better models; others find the prospect of AI baked into OSes/browsers dystopian.