$2 H100s: How the GPU Rental Bubble Burst

Title & pricing confusion

  • Many readers initially misread “$2 H100s” as a purchase price, not rental, and called the original title click‑baity.
  • Once clarified as ~$2/hour to rent, most agreed the drop from ~$8/hour is still sharp and notable for the market.

Economics of $2/hr (and below)

  • Back‑of‑the‑envelope math: at ~$50k per H100, $2/hr means $50,000 ÷ $2 = 25,000 billable hours, i.e. ~2.9 years at 100% utilization and ~3–3.5 years at realistic near‑full utilization, before counting power, networking, rack space, labor, or financing (see the sketch after this list).
  • Some argue $2/hr (or promotional ~$1–1.5/hr) must be loss‑making unless hardware or space/power is effectively “free,” or the provider is burning VC money.
  • Others counter that once hardware is bought, sunk cost is sunk: renting at or near marginal cost (power, ops) is rational to minimize losses versus idling or fire‑selling.
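A minimal sketch of the break‑even arithmetic the commenters are doing. The all‑in capex, utilization, and per‑hour opex below are illustrative assumptions, not figures from the thread:

    # Back-of-the-envelope break-even for renting out an H100 at a fixed hourly rate.
    # All inputs are assumptions for illustration, not measured figures.

    HOURS_PER_YEAR = 24 * 365  # 8,760

    def breakeven_years(capex_usd=50_000,    # assumed all-in cost per H100 (card + share of server)
                        rate_usd_hr=2.00,    # rental price
                        utilization=0.90,    # fraction of hours actually billed
                        opex_usd_hr=0.20):   # assumed power + ops cost per GPU-hour
        """Years of renting needed to recover capex, ignoring financing and resale value."""
        margin_per_hour = (rate_usd_hr - opex_usd_hr) * utilization
        return capex_usd / (margin_per_hour * HOURS_PER_YEAR)

    print(f"at $2/hr: {breakeven_years():.1f} years")                  # ~3.5 years
    print(f"at $8/hr: {breakeven_years(rate_usd_hr=8.00):.1f} years")  # ~0.8 years

With the same assumptions, a promotional ~$1/hr rate does not recover capex within a typical 3–5 year depreciation window, which is why some commenters read those prices as loss‑making unless the hardware or space/power is effectively free.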

Capacity, clustering, and real constraints

  • Key distinction: single PCIe H100s vs. tightly networked SXM/InfiniBand clusters. The latter (used for large‑scale training) still command higher prices.
  • Comments stress that “$2/hr H100” usually comes with constraints: PCIe cards, weaker networking, spot‑style capacity, beta‑quality uptime, or limited user access.
  • Some say the real money is in renting whole, reliable multi‑node clusters; cheap single‑GPU time is a side effect of over‑reserved capacity.

Bubble, AI winter, and Nvidia

  • Opinions are split on whether this signals an AI infra bubble bursting or just a normal boom‑bust supply cycle, as in oil or crypto mining.
  • Some are short or cautious on Nvidia at its current valuation; others argue demand will rebound as cheaper compute spurs more fine‑tuning and inference.
  • Several note that next‑gen GPUs (Blackwell, MI300X, etc.) and ongoing foundational‑model races will keep aggregate compute demand high.

Open models, training vs. inference

  • Broad agreement that open‑weights models (e.g., Llama families) reduce the need for most companies to train from scratch.
  • Thread consensus: only a small number of teams globally need very large H100 clusters for new foundation models; most just fine‑tune and serve inference.
  • This shift depresses demand for top‑end training clusters while expanding use of smaller, cheaper GPU setups.

Infrastructure, reliability, and cloud vs. bare metal

  • Debate over whether low‑cost providers are inherently unreliable versus “good enough” for checkpointed training workloads.
  • Some emphasize that hyperscaler‑style uptime and 45kW+ racks are extremely expensive; others say cloud reliability is oversold and many on‑prem setups could match it.
  • Power cost per H100‑hour is small compared to capex and depreciation; rack density, networking, and financing dominate the economics (see the sketch below).
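A rough sketch of the per‑GPU‑hour split behind that last point; the wattage, overhead multiplier, electricity price, and write‑off period are illustrative assumptions:

    # Rough per-GPU-hour cost split: power vs. straight-line depreciation.
    # All inputs are assumptions for illustration.

    gpu_watts = 700            # H100 SXM board power (TDP)
    overhead = 1.5             # multiplier for cooling, fans, conversion losses
    power_price = 0.10         # assumed $ per kWh (industrial rate)

    capex = 50_000             # assumed all-in cost per H100
    depreciation_years = 3     # assumed straight-line write-off horizon

    power_cost_hr = gpu_watts / 1000 * overhead * power_price
    depreciation_hr = capex / (depreciation_years * 24 * 365)

    print(f"power        ~${power_cost_hr:.2f}/hr")    # ~$0.10/hr
    print(f"depreciation ~${depreciation_hr:.2f}/hr")  # ~$1.90/hr

Even doubling the assumed power price leaves electricity roughly an order of magnitude below hourly depreciation, which is why the thread treats capex, rack density, and networking as the dominant costs.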