Nvidia’s $589B DeepSeek rout

Market reaction and stock moves

  • Nvidia and other “AI trade” stocks dropped sharply, and ASML fell with them; many see the move as an overreaction driven more by narrative than by fundamentals.
  • Some view this as the AI bubble finally deflating or at least a correction of “priced for perfection” valuations; others compare it to dotcom-era volatility where tech was real but timelines and moats were mispriced.
  • Several comments stress that markets are largely a beauty contest of expectations about expectations, not a clean reflection of real-world AI demand or Nvidia’s current business.

DeepSeek’s claims, verification, and skepticism

  • DeepSeek reports training a frontier-scale reasoning model for roughly $6M on H800s, with detailed papers and open weights.
  • Skeptics question whether the training cost is understated or the hardware access misrepresented, or whether the release is politically motivated PR; some suspect unreported H100 clusters or hidden subsidies.
  • Others check FLOPs, architecture, token counts, and MFU and argue the numbers basically add up (a back-of-envelope version of that check is sketched after this list); early replications (including small-scale Berkeley work and live Hugging Face efforts) support genuine efficiency gains, at least for smaller models.
  • Key nuance: the $6M figure is for V3 pretraining; total R1 cost isn’t fully disclosed, and much of the gain appears to come from architectural and low-level engineering innovations, not magic.
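A rough version of that sanity check, in Python. The inputs are approximate public figures from the V3 technical report (about 37B activated parameters, 14.8T pretraining tokens, roughly 2.79M H800 GPU-hours at an assumed $2/GPU-hour); the 6·N·D rule and the peak-throughput number are coarse assumptions, so treat the output as an order-of-magnitude plausibility check, not a verification.

    # Back-of-envelope check of the reported V3 pretraining budget.
    # All inputs are approximate public figures or coarse assumptions.
    active_params = 37e9        # activated parameters per token (MoE)
    tokens        = 14.8e12     # pretraining tokens
    gpu_hours     = 2.79e6      # reported H800 GPU-hours for pretraining
    price_per_hr  = 2.0         # assumed rental price, USD per GPU-hour
    peak_flops    = 990e12      # rough H800 dense BF16 peak, FLOP/s

    train_flops = 6 * active_params * tokens        # standard 6*N*D estimate
    per_gpu     = train_flops / (gpu_hours * 3600)  # implied sustained FLOP/s per GPU
    mfu         = per_gpu / peak_flops              # implied utilization vs. BF16 peak
    cost        = gpu_hours * price_per_hr

    print(f"training compute : {train_flops:.2e} FLOPs")
    print(f"implied MFU      : {mfu:.0%}")
    print(f"rental cost      : ${cost / 1e6:.1f}M")

With these inputs the script lands around 3.3e24 FLOPs, roughly a third of the assumed BF16 peak, and about $5.6M, which is the sense in which commenters say the numbers “basically add up”: aggressive but physically plausible, with the remaining gap attributed to FP8, MoE sparsity, and low-level engineering rather than hidden hardware.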

Consequences for Nvidia, GPUs, and data centers

  • Bear case: if you can match o1‑like performance with ~10–50× less compute, hyperscalers’ mega-capex and Nvidia’s extreme margins look less justifiable; Nvidia’s valuation assumed continued exponential GPU demand and lack of real alternatives.
  • Bull case: the Jevons paradox, where cheaper intelligence increases total AI usage, expands the customer base beyond a handful of hyperscalers, and still leaves training and reasoning heavily compute-bound; more efficient techniques can be applied to even larger clusters (a toy version of this trade-off follows the list).
  • Additional concern: if smaller or non‑US players can do frontier-ish work on older or commodity hardware, Nvidia’s pricing power and “only game in town” narrative weaken, even if unit demand stays high.
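A toy illustration of the bear/bull split above; every number is invented, purely to show the shape of the argument. Whether aggregate compute demand shrinks or grows after a ~30× efficiency gain depends entirely on how much the cheaper per-query cost expands usage.

    # Toy Jevons-paradox arithmetic; all numbers are made up for illustration.
    def total_compute(queries, flops_per_query):
        return queries * flops_per_query

    baseline = total_compute(queries=1e9, flops_per_query=1e12)

    # Same 30x efficiency gain, two demand assumptions:
    bear = total_compute(queries=1e9,  flops_per_query=1e12 / 30)  # usage stays flat
    bull = total_compute(queries=1e11, flops_per_query=1e12 / 30)  # usage grows 100x

    print(f"bear case: {bear / baseline:.2f}x baseline compute demand")  # ~0.03x
    print(f"bull case: {bull / baseline:.2f}x baseline compute demand")  # ~3.33x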

Impact on OpenAI/Anthropic and foundation-model economics

  • Many think the real losers are closed, capital‑intensive labs whose moat was “only we can afford to train frontier models on vast GPU farms.”
  • Distillation and cheap replication of reasoning models compress prices and erode the “rent-seeking” thesis that justified huge private valuations and projects like Stargate (a minimal distillation sketch follows this list).
  • The consensus is shifting toward foundation models being fungible and commoditizable; value migrates to interfaces, integration, data ownership, and distribution (e.g., hyperscalers, incumbents like Meta, cloud platforms).
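For readers unfamiliar with the distillation route mentioned above, here is a minimal sketch of its simplest variant: supervised fine-tuning of a small student on reasoning traces sampled from a stronger teacher. Model names, the prompt, and the hyperparameters are illustrative placeholders, not DeepSeek's (or anyone's) actual recipe.

    # Minimal distillation-by-SFT sketch: sample reasoning traces from a
    # "teacher" model, then fine-tune a smaller "student" on those traces.
    # Model IDs, prompts, and hyperparameters are illustrative placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    TEACHER_ID = "some-org/large-reasoning-model"   # placeholder
    STUDENT_ID = "some-org/small-open-model"        # placeholder

    teacher_tok = AutoTokenizer.from_pretrained(TEACHER_ID)
    teacher = AutoModelForCausalLM.from_pretrained(TEACHER_ID, torch_dtype=torch.bfloat16)

    prompts = ["Prove that the sum of two even integers is even."]  # toy prompt set

    # 1) Sample long-form answers (reasoning traces) from the teacher.
    traces = []
    for p in prompts:
        inputs = teacher_tok(p, return_tensors="pt")
        out = teacher.generate(**inputs, max_new_tokens=512,
                               do_sample=True, temperature=0.7)
        traces.append(teacher_tok.decode(out[0], skip_special_tokens=True))

    # 2) Ordinary supervised fine-tuning of the student on those traces.
    student_tok = AutoTokenizer.from_pretrained(STUDENT_ID)
    student = AutoModelForCausalLM.from_pretrained(STUDENT_ID)
    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

    student.train()
    for text in traces:
        batch = student_tok(text, return_tensors="pt",
                            truncation=True, max_length=1024)
        loss = student(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    student.save_pretrained("distilled-student")

The point commenters draw from this loop is that, once strong open weights exist, running it is cheap relative to pretraining, which is what compresses prices for everyone downstream.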

China, export controls, and geopolitics

  • DeepSeek is widely read as proof that export controls and H800 downgrades did not prevent China from reaching near‑frontier performance and may even have forced more aggressive efficiency work (PTX-level optimizations, bandwidth-aware architectures).
  • Some argue Chinese AI companies may be using smuggled high-end GPUs; others note the political incentives to under‑report capabilities or to time announcements for maximum geopolitical and market impact.
  • Several commenters predict growing Chinese capability in GPUs, HBM, and lithography, potentially challenging Nvidia and ASML over a 5‑10 year horizon.

Open models, IP, and legal/ethical side threads

  • The discussion revisits whether training on copyrighted data is unlawful or fair use, and whether LLMs “contain” verbatim works when they can output scripts on demand.
  • DeepSeek’s openness (papers + weights) is contrasted with closed US labs; some see it as reviving the older norm of publishing major advances, others as a prestige or geopolitical move.
  • There is broad agreement that open weights and reproducible recipes make it hard for any one lab to sustain a durable moat purely on model training.