We were wrong about GPUs
Nvidia, Virtualization, and Why GPUs Were Hard on Fly
- Several comments dig into Fly’s technical story: Nvidia’s vGPU licensing and “phone‑home” checks don’t mesh with Fly’s fast‑start microVM model.
- MIG is described as paravirtualized and tied to Nvidia’s userland stack, not clean PCI devices, making secure cross‑VM sharing difficult without heavy custom work.
- Ideas like virtio‑cuda, using Nvidia’s vCS via QEMU, or disaggregated emulation are discussed, but generally seen as high‑maintenance and possibly in conflict with Nvidia’s terms.
- Some argue QEMU startup cost is overstated and that Fly’s Cloud Hypervisor work essentially rebuilt similar VFIO‑style plumbing.
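The "VFIO‑style plumbing" referenced above boils down to detaching the GPU from its host driver and handing the whole PCI function to a guest. A rough sketch of those steps, assuming a Cloud Hypervisor guest (the PCI address, kernel, and disk paths are placeholders; the sysfs writes need root and an enabled IOMMU):

```python
# Sketch: bind a GPU to vfio-pci, then pass it through to a Cloud Hypervisor guest.
# Requires root and IOMMU support enabled on the kernel cmdline
# (intel_iommu=on or amd_iommu=on). All identifiers below are illustrative.
import pathlib
import subprocess

GPU = "0000:01:00.0"  # placeholder PCI address; find the real one with `lspci -nn`
DEV = pathlib.Path("/sys/bus/pci/devices") / GPU

def bind_to_vfio() -> None:
    """Detach the device from its current host driver and rebind it to vfio-pci."""
    (DEV / "driver" / "unbind").write_text(GPU)
    (DEV / "driver_override").write_text("vfio-pci")
    pathlib.Path("/sys/bus/pci/drivers_probe").write_text(GPU)

def launch_guest() -> None:
    """Boot a Cloud Hypervisor microVM with the PCI function passed through."""
    subprocess.run([
        "cloud-hypervisor",
        "--kernel", "vmlinux",          # placeholder guest kernel
        "--disk", "path=rootfs.img",    # placeholder rootfs
        "--cpus", "boot=4",
        "--memory", "size=8G",
        "--device", f"path={DEV}/",     # VFIO passthrough of the GPU
    ], check=True)
```

This is the generic kernel-side mechanism, not Fly's code; the thread's point is that MIG slices don't present as clean PCI functions like this, so the simple passthrough path doesn't give you secure cross‑VM sharing.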
Mismatch Between Fly’s Users and GPU Demand
- A recurring theme: Fly’s core audience wants a PaaS‑like “git push” DX, not low‑level GPU primitives.
- Commenters say GPU buyers want either (a) big, dedicated clusters for heavy training/inference or (b) fully managed LLM APIs; Fly sits awkwardly in between.
- People note that customers who pay hyperscaler‑level GPU prices usually prefer hyperscalers or specialist GPU clouds, not a mid‑tier app platform.
Cost, Reliability, and Alternatives
- Hobbyists and small teams largely find Fly (and its GPUs) too expensive versus homelabs, cheap VPSes, or dedicated servers; GPU marketplaces like Runpod, Vast, Voltage Park, and others are frequently cited.
- Some praise Fly’s GPU DX (fast on‑demand machines, simple CLI) but say ongoing costs and storage pricing make continuous or casual use hard to justify.
- There is skepticism about Fly’s overall reliability history; Fly staff claim it has improved and emphasize autosuspend/auto‑stop as key to cost control.
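The autosuspend/auto‑stop behavior Fly staff point to is configured per app in `fly.toml`; a minimal illustrative fragment (app name, region, and port are made up, and field spellings should be checked against current flyctl docs):

```toml
# fly.toml fragment: scale to zero when idle so a GPU machine stops billing
app = "my-llm-app"        # illustrative
primary_region = "ord"    # illustrative

[http_service]
  internal_port = 8080
  auto_stop_machines = true    # stop Machines when traffic drops to zero
  auto_start_machines = true   # restart on the next incoming request
  min_machines_running = 0     # allow full scale-to-zero
```

The trade‑off commenters raise is that scale‑to‑zero controls cost but reintroduces the cold‑start problem discussed below for GPU workloads.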
Do Developers Want GPUs or Just LLMs?
- Many agree with the article’s claim that most developers “want LLMs, not GPUs”: they’d rather call OpenAI/Anthropic/Cloudflare Workers AI than manage drivers, models, and cold starts.
- Others push back, citing non‑LLM GPU use (vision, “classic” ML, data science) and open‑source LLM self‑hosting as real but more niche workloads.
- There’s broad agreement that GPU serverless suffers from long cold starts and that today’s API pricing and performance are “good enough” for many apps.
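The "just call an API" path most commenters prefer amounts to a single HTTPS POST against a hosted chat‑completions endpoint, versus managing drivers, model weights, and cold starts yourself. A minimal stdlib sketch of that call shape (the endpoint URL, model name, and key are placeholders for whichever provider you use):

```python
# Sketch: calling a hosted OpenAI-style chat-completions API with only the stdlib.
# Endpoint, model, and key below are illustrative placeholders.
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build an OpenAI-style chat-completions request body."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body).encode("utf-8")

def call_llm(endpoint: str, api_key: str, model: str, prompt: str) -> str:
    """POST the request and return the first completion's text."""
    req = urllib.request.Request(
        endpoint,
        data=build_chat_request(model, prompt),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]

# Usage (needs a real key and network access):
#   call_llm("https://api.openai.com/v1/chat/completions",
#            api_key, "some-model", "Hello")
```

The contrast with the self‑hosting bullets above is the whole argument: for many apps this one request replaces the entire GPU provisioning story.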
Fly’s Positioning and Takeaways
- Several commenters say the outcome was predictable: Fly’s brand and DX attract app developers, not infra buyers; succeeding in GPUs would require a different product and focus.
- Others think Fly exited too early, arguing demand for simpler private LLM and ML pipelines is only beginning.
- The candid “we were wrong” post is widely respected, but many frame this as a classic product‑market fit miss, not a verdict on cloud GPUs in general.