On DeepSeek and export controls

Technical claims and cost comparisons

  • Commenters focus on the blog’s new detail that Claude 3.5 Sonnet cost “a few tens of millions” to train and was not distilled from a larger model; this contradicts prior rumors and surprises many.
  • Several people contest the author’s framing that DeepSeek “did not do for $6M what cost US companies billions.”
    • Even accepting the author’s own figures, they see a 3–10x training cost gap, and note that DeepSeek’s model appears similarly capable while being far cheaper to run.
    • Users highlight that DeepSeek’s inference cost is reportedly 15–50x lower than comparable US APIs, and question whether US labs simply run at high margins or lack comparable optimization.
  • Some argue DeepSeek’s methods (MoE, PTX tuning) are not magic but expected steps on a general cost curve; others counter that constrained Chinese hardware gave DeepSeek strong incentives to push memory and efficiency innovations (MLA, FP8, scheduling).
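The MoE point in the cost debate above comes down to simple arithmetic. A rough sketch, using publicly reported DeepSeek‑V3 figures (671B total parameters, ~37B active per token, ~14.8T pretraining tokens) and the common 6·N·D approximation for training FLOPs; the numbers are illustrative, not a claim about any lab’s actual bill:

```python
# Back-of-envelope: why MoE routing cuts training compute.
# Uses the common approximation C ≈ 6 * N_active * D FLOPs,
# with publicly reported DeepSeek-V3 figures (illustrative only).

def train_flops(active_params: float, tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per active parameter per token."""
    return 6 * active_params * tokens

TOKENS = 14.8e12   # reported pretraining token count
DENSE = 671e9      # total parameters (what a dense model this size would activate)
ACTIVE = 37e9      # parameters actually active per token under MoE routing

dense_cost = train_flops(DENSE, TOKENS)
moe_cost = train_flops(ACTIVE, TOKENS)
print(f"MoE needs ~{dense_cost / moe_cost:.0f}x less compute "
      f"than an equally sized dense model")
```

Routing each token through a small subset of experts accounts for much of the headline cost reduction before any kernel‑level tricks (PTX tuning, FP8) enter the picture, which is why some commenters see the result as an expected point on the cost curve rather than magic.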

Export controls, chips, and zero‑sum dynamics

  • Many see export controls as shortsighted: China is expected to reach domestic chip parity or near‑parity soon, and tighter controls may accelerate Huawei/Ascend and other Nvidia competitors.
  • Others argue the controls are a rational “lesser evil”: hostile states will exploit any advantage, so limiting training‑grade chips slows their military AI, even if only temporarily.
  • There is debate over whether the chip market is effectively zero‑sum while leading fabs run near capacity, so that each chip sold to China is one unavailable to US labs.
  • Some note DeepSeek already running on non‑US hardware complicates the export‑control narrative.

Geopolitics, morality, and AI power

  • The article’s call for US/allied AI dominance, and its fears of Chinese military applications, are widely criticized as self‑serving, nationalist, “Cold War‑style” rhetoric likely to be self‑fulfilling.
  • Several point out US human‑rights abuses and military interventions, rejecting a simple “democracies good, China bad” framing; others still prefer US hegemony over Chinese.
  • There is extensive argument over unipolar vs multipolar worlds, historical war patterns, and whether US export controls are about democracy or raw trade power.
  • Some worry chip and AI controls could extend to consumer hardware in the future; others respond that current regimes focus on training, not inference.

Race dynamics, regulation, and incentives

  • Multiple commenters see the piece as an attempt by a major US lab to lobby for regulations that entrench incumbents and create a moat against cheaper, open‑weights competitors.
  • Others welcome that DeepSeek’s open release is forcing US labs to reveal more about training costs and methods.
  • The blog’s casual prediction of near‑term “superhuman at almost all things” AI (2026–2027) is met with skepticism or dismissed as vested‑interest hype.