On DeepSeek and export controls

Technical claims and cost comparisons

  • Commenters focus on the blog’s new detail that Claude 3.5 Sonnet cost “a few tens of millions” to train and was not distilled from a larger model; this contradicts prior rumors and surprises many.
  • Several people contest the author’s framing that DeepSeek “did not do for $6M what cost US companies billions.”
    • Even accepting the author’s own figures, they see a 3–10x training cost gap, and note that DeepSeek’s model appears similarly capable while being far cheaper to run.
    • Users highlight that DeepSeek’s inference cost is reportedly 15–50x lower than comparable US APIs, and question whether US labs simply run at high margins or lack comparable optimization.
  • Some argue DeepSeek’s methods (MoE, PTX tuning) are not magic but expected steps on a general cost curve; others counter that constrained Chinese hardware gave DeepSeek strong incentives to push memory and efficiency innovations (MLA, FP8, scheduling).
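The MoE point in the cost debate above comes down to simple arithmetic. A rough sketch, using publicly reported DeepSeek‑V3 figures (671B total parameters, ~37B active per token, ~14.8T pretraining tokens) and the common 6·N·D approximation for training FLOPs; the numbers are illustrative, not a claim about any lab’s actual bill:

```python
# Back-of-envelope: why MoE routing cuts training compute.
# Uses the common approximation C ≈ 6 * N_active * D FLOPs,
# with publicly reported DeepSeek-V3 figures (illustrative only).

def train_flops(active_params: float, tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per active parameter per token."""
    return 6 * active_params * tokens

TOKENS = 14.8e12   # reported pretraining token count
DENSE = 671e9      # total parameters (what a dense model this size would activate)
ACTIVE = 37e9      # parameters actually active per token under MoE routing

dense_cost = train_flops(DENSE, TOKENS)
moe_cost = train_flops(ACTIVE, TOKENS)
print(f"MoE needs ~{dense_cost / moe_cost:.0f}x less compute "
      f"than an equally sized dense model")
```

Routing each token through a small subset of experts accounts for much of the headline cost reduction before any kernel‑level tricks (PTX tuning, FP8) enter the picture, which is why some commenters see the result as an expected point on the cost curve rather than magic.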

Export controls, chips, and zero‑sum dynamics

  • Many see export controls as shortsighted: China is expected to reach domestic chip parity or near‑parity soon, and tighter controls may accelerate Huawei/Ascend and other Nvidia competitors.
  • Others argue the controls are a rational “lesser evil”: hostile states will exploit any advantage, so limiting training‑grade chips slows their military AI, even if only temporarily.
  • There is debate over whether the chip market is effectively zero‑sum while leading fabs run near capacity, so that each chip sold to China is one unavailable to US labs.
  • Some note DeepSeek already running on non‑US hardware complicates the export‑control narrative.

Geopolitics, morality, and AI power

  • The article’s call for US/allied AI dominance, and its fears of Chinese military applications, are widely criticized as self‑serving, nationalist, “Cold War‑style” rhetoric likely to be self‑fulfilling.
  • Several point out US human‑rights abuses and military interventions, rejecting a simple “democracies good, China bad” framing; others still prefer US hegemony over Chinese.
  • There is extensive argument over unipolar vs multipolar worlds, historical war patterns, and whether US export controls are about democracy or raw trade power.
  • Some worry chip and AI controls could extend to consumer hardware in the future; others respond that current regimes focus on training, not inference.

Race dynamics, regulation, and incentives

  • Multiple commenters see the piece as an attempt by a major US lab to lobby for regulations that entrench incumbents and create a moat against cheaper, open‑weights competitors.
  • Others welcome that DeepSeek’s open release is forcing US labs to reveal more about training costs and methods.
  • The blog’s casual prediction of near‑term “superhuman at almost all things” AI (2026–2027) is met with skepticism or dismissed as vested‑interest hype.