$2 H100s: How the GPU Rental Bubble Burst
Title & pricing confusion
- Many readers initially misread “$2 H100s” as a purchase price rather than an hourly rental rate, and called the original title click‑baity.
- Once clarified as ~$2/hour rental, most agreed this is still a sharp drop from ~$8/hour and notable for the market.
Economics of $2/hr (and below)
- Back‑of‑the‑envelope math: at ~$50k per H100, $2/hr means ~25,000 rented hours to recover hardware cost alone — roughly 3–3.5 years even at near‑full utilization, before counting power, networking, rack space, labor, or financing.
- Some argue $2/hr (or promotional ~$1–1.5/hr) must be loss‑making unless hardware or space/power is effectively “free,” or the provider is burning VC money.
- Others counter that once hardware is bought, sunk cost is sunk: renting at or near marginal cost (power, ops) is rational to minimize losses versus idling or fire‑selling.
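The break‑even arithmetic in the bullets above can be sketched in a few lines. All inputs here (utilization, power draw, electricity price) are illustrative assumptions, not figures from the thread — only the ~$50k hardware price and $2/hr rate come from the discussion:

```python
# Back-of-the-envelope H100 rental economics (illustrative assumptions).
HOURS_PER_YEAR = 8760

capex = 50_000          # H100 purchase price from the thread, USD
rate = 2.00             # rental price, USD/hour
utilization = 0.90      # assumed fraction of hours actually rented

power_kw = 0.7          # assumed draw per H100 incl. cooling overhead, kW
elec_price = 0.08       # assumed electricity price, USD/kWh

revenue_per_hr = rate * utilization
power_cost_per_hr = power_kw * elec_price
# Gross margin per hour; ignores networking, rack space, labor, financing.
margin_per_hr = revenue_per_hr - power_cost_per_hr

breakeven_years = capex / (margin_per_hr * HOURS_PER_YEAR)
print(f"gross margin: ${margin_per_hr:.2f}/hr, "
      f"capex break-even: {breakeven_years:.1f} years")
```

Under these assumptions the break‑even lands around 3.3 years, consistent with the ~3–3.5 year figure above; the “sunk cost” argument is just the observation that once capex is spent, any rate above `power_cost_per_hr` beats idling.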
Capacity, clustering, and real constraints
- Key distinction: single PCIe H100s vs tightly networked SXM/InfiniBand clusters. The latter, needed for large‑scale training, still command higher prices.
- Comments stress that “$2/hr H100” often means constraints: PCIe cards, weaker networking, spot‑style capacity, beta‑quality uptime, or limited user access.
- Some say the real money is in renting whole, reliable multi‑node clusters; cheap single‑GPU time is a side effect of over‑reserved capacity.
Bubble, AI winter, and Nvidia
- Opinions split on whether this signals an AI infra bubble bursting or just a normal boom‑bust supply cycle like oil or crypto mining.
- Some are short or cautious on Nvidia at its current valuation; others argue demand will rebound as cheaper compute spurs more fine‑tuning and inference.
- Several note that next‑gen GPUs (Blackwell, MI300X, etc.) and ongoing foundational‑model races will keep aggregate compute demand high.
Open models, training vs. inference
- Broad agreement that open‑weights models (e.g., Llama families) reduce the need for most companies to train from scratch.
- Thread consensus: only a small number of teams globally need very large H100 clusters for new foundation models; most just fine‑tune and serve inference.
- This shift depresses demand for top‑end training clusters while expanding use of smaller, cheaper GPU setups.
Infrastructure, reliability, and cloud vs. bare metal
- Debate over whether low‑cost providers are inherently unreliable versus “good enough” for checkpointed training workloads.
- Some emphasize that hyperscaler‑style uptime and 45kW+ racks are extremely expensive; others say cloud reliability is oversold and many on‑prem setups could match it.
- Power cost per H100 is relatively small compared to capex and depreciation; rack density, networking, and financing dominate economics.
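As a rough sanity check on the last bullet, annual power spend per card can be compared against straight‑line depreciation. The numbers below are illustrative assumptions (depreciation period, power draw, electricity rate are not from the thread; only the ~$50k price is):

```python
# Annual power vs depreciation for one H100 (illustrative assumptions).
capex = 50_000           # purchase price from the thread, USD
depreciation_years = 4   # assumed straight-line depreciation period
power_kw = 0.7           # assumed draw incl. cooling overhead, kW
elec_price = 0.08        # assumed electricity price, USD/kWh
HOURS_PER_YEAR = 8760

annual_depreciation = capex / depreciation_years
annual_power = power_kw * elec_price * HOURS_PER_YEAR  # running 24/7
print(f"depreciation: ${annual_depreciation:,.0f}/yr vs "
      f"power: ${annual_power:,.0f}/yr")
```

Even running 24/7, power comes out around $500/yr against $12,500/yr of depreciation under these assumptions — an order of magnitude apart, which is why the thread treats capex, density, and networking as the dominant costs.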