2026-06-25

The unbearable cheapness of open weight models

Economics of Open-Weight vs Closed Models

Many argue inference is already “dirt cheap” and getting cheaper; training and RLHF/RLAIF are the real cost drivers.
Open-weight providers (e.g., DeepSeek, Xiaomi) are seen as lacking brand/technical moat, so they must price at commodity levels.
Others question sustainability of ultra-low prices, citing GPU depreciation and claims that some providers’ margins look unrealistic.
There’s disagreement on whether frontier labs’ inference margins are ~50% (constrained by costs) or >90% (with cash burn mostly from training and org bloat).
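The margin dispute comes down to simple arithmetic on serving cost versus price. The sketch below is a rough back-of-the-envelope model, not something from the thread: the function names and every number (GPU price, lifetime, power draw, throughput, utilization) are illustrative assumptions chosen only to show how depreciation feeds into cost per million tokens.

```python
def cost_per_million_tokens(gpu_price_usd, lifetime_years, power_kw,
                            usd_per_kwh, tokens_per_second, utilization):
    """Hardware depreciation plus electricity, per million generated tokens.

    Ignores networking, staff, and datacenter overhead (an assumption).
    """
    hours_per_year = 24 * 365
    depreciation_per_hour = gpu_price_usd / (lifetime_years * hours_per_year)
    power_per_hour = power_kw * usd_per_kwh
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return (depreciation_per_hour + power_per_hour) / tokens_per_hour * 1_000_000


def gross_margin(price_per_million, cost_per_million):
    """Fraction of the price left after serving cost."""
    return (price_per_million - cost_per_million) / price_per_million


# Hypothetical inputs: a $30k accelerator written off over 4 years,
# 1 kW at $0.10/kWh, 2000 tokens/s when busy, busy half the time.
cost = cost_per_million_tokens(30_000, 4, 1.0, 0.10, 2000, 0.5)
for price in (0.30, 0.55, 3.00):
    print(f"price ${price:.2f}/M -> margin {gross_margin(price, cost):.0%}")
```

Under these made-up inputs, the same serving cost supports a thin margin or a very fat one depending only on the price charged, which is why both the ~50% and >90% claims can sound plausible until real throughput and utilization are known.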
Subscriptions vs. API: per-token API pricing looks expensive next to flat-rate plans, which are likely subsidized; some think the plans are still profitable because many subscribers leave part of their quota unused.
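The unused-quota argument can be made concrete with a break-even calculation. This is a minimal sketch under stated assumptions: the fee, quota size, and serving cost are hypothetical placeholders, and the helper names are invented for illustration.

```python
def subscription_profit(monthly_fee, quota_millions, avg_quota_used,
                        cost_per_million):
    """Provider profit per subscriber per month.

    avg_quota_used is the average fraction of the quota actually consumed.
    """
    serving_cost = quota_millions * avg_quota_used * cost_per_million
    return monthly_fee - serving_cost


def break_even_usage(monthly_fee, quota_millions, cost_per_million):
    """Fraction of quota at which the plan stops being profitable."""
    return monthly_fee / (quota_millions * cost_per_million)


# Hypothetical plan: $20/month, 100M-token quota, $0.50 per million to serve.
print(break_even_usage(20, 100, 0.5))           # fraction of quota
print(subscription_profit(20, 100, 0.25, 0.5))  # light average usage
print(subscription_profit(20, 100, 1.0, 0.5))   # every user maxes out
```

With these placeholder numbers the plan loses money on a subscriber who uses the full quota but is profitable as long as average usage stays below the break-even fraction, which is the shape of the argument commenters make.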

Business Models and Moats for Frontier Labs

Concern that open-weight models will commoditize “95% of tasks” (coding, research, cowork-style work), dragging down prices.
Proposed survival paths:
- Push frontier capabilities (AGI, advanced science, biotech, nuclear design) and take royalties.
- Own the application layer (enterprise tools, vertical products) and sell “best in class, safe, compliant” to large orgs.
- Regulatory capture: “certified/regulated models only,” federal procurement rules, copyright licenses as gatekeeping.
- Buy up scarce hardware (RAM/GPUs) temporarily to delay competitors until an IPO.
Skeptics doubt the frontier labs’ app-design chops and note that incumbents in verticals (law, finance, healthcare) can pair domain expertise with open weights.

Capabilities, AGI, and Data Constraints

Several see progress as gradual and “moat-less”: open models already nip at the heels of frontier labs, especially when paired with good harnesses/agents.
Debate on whether we are compute-limited or data-limited:
- One side: high-quality data is the bottleneck; synthetic data helps only conditionally and can cause “model collapse” if overused.
- Other side: synthetic data plus partner datasets (biotech, engineering, etc.) can extend scaling.
Recursive self-improvement / “singularity” is viewed as overhyped; commenters see scaling compute as the dominant driver of progress.

Practical Use of Open-Weight and Local Models

Multiple commenters report doing substantial work (coding agents, translation, niche CVE discovery, custom scripting) cheaply with local 12–35B models plus caching.
KV/prompt caching and agent loops can push effective token cost near zero for many workloads.
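To see why caching matters so much for agent loops: a loop that re-sends its whole context every turn pays for a growing prefix again and again, while a cache lets it pay full price only for the new tokens. The sketch below is a simplified cost model, not any provider's actual billing; `cached_price_ratio` and the other names are assumptions made for illustration.

```python
def agent_loop_input_cost(turns, system_tokens, tokens_per_turn,
                          price_per_million, cached_price_ratio=1.0):
    """Total input cost of a loop that re-sends its full context each turn.

    cached_price_ratio is the price of a cache-hit token relative to a fresh
    one: 1.0 means no caching, 0.0 means cached tokens are free (for example
    a local KV cache on hardware you already own).
    """
    cost = 0.0
    cached_prefix = 0          # tokens already processed on an earlier turn
    context = system_tokens
    for _ in range(turns):
        context += tokens_per_turn
        fresh = context - cached_prefix
        billed = fresh + cached_prefix * cached_price_ratio
        cost += billed * price_per_million / 1_000_000
        cached_prefix = context
    return cost


# Hypothetical loop: 50 turns, 4k-token system prompt, 2k new tokens per turn.
no_cache = agent_loop_input_cost(50, 4000, 2000, 1.0, cached_price_ratio=1.0)
with_cache = agent_loop_input_cost(50, 4000, 2000, 1.0, cached_price_ratio=0.0)
print(no_cache, with_cache)
```

Without caching the billed tokens grow roughly quadratically with the number of turns; with a free cache they grow linearly, so the gap widens the longer the loop runs.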
For many business tasks, human-level or even sub-human intelligence suffices; cost, control, and predictability matter more than absolute quality.
Some still find frontier coding models clearly superior for complex tasks.

Trust, Control, and Regulation

Strong preference from some for open weights due to: data control, auditability, avoiding arbitrary bans, and compliance predictability.
Others note local hosting can be “cheaper until you factor in security and liability,” which may push regulated industries toward big closed providers.
Expectation from several that governments (especially the US) will ban or constrain foreign/open models, tying favored domestic labs to state interests.

Space and Exotic Infrastructure (Data Centers in Orbit)

Long subthread on space-based data centers:
- Optimists note SpaceX’s track record (reusable boosters, Starlink, on-orbit power and AI chips) and see orbital compute as engineering-feasible.
- Skeptics stress economics and scaling: radiation, cooling, maintenance, launch costs, and the need for massive mass-to-orbit make large-scale deployment likely uneconomic for decades, if ever.
- General consensus: technically possible in niche form; commercial viability at scale is unclear.

Open Models, Market Structure, and Europe

Some argue open weights themselves act as a moat: any new commercial model must beat strong free/cheap baselines, making new entrants (e.g., from Europe) harder to justify.
Others point to existing European efforts and long-run benefits of open infrastructure.

Conceptual Clarifications: “Open Weights”

“Open weights” = you can download the full parameter tensors and associated files (tokenizer, architecture metadata, configs) and run inference yourself.
Fully “open models” also publish training code and datasets; these are less common.
Weights are just large numeric matrices governing the model’s behavior; open weights enable fine-tuning and local deployment without revealing original training data.
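As a toy illustration of the point that weights are just numbers: the sketch below treats a "checkpoint" as a small matrix and bias vector, runs inference with them, and takes one fine-tuning step on a new example. Everything here is a deliberately tiny stand-in written for this summary; real releases hold billions of parameters and are loaded with a framework rather than plain lists.

```python
# A toy "checkpoint": a 2x2 weight matrix and a bias vector.
weights = {
    "W": [[0.5, -1.0], [2.0, 0.0]],
    "b": [0.1, -0.2],
}


def forward(weights, x):
    """Inference: y = W x + b. Only the numbers are needed, not training data."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights["W"], weights["b"])]


def squared_error(y, target):
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def fine_tune_step(weights, x, target, lr=0.05):
    """One gradient-descent step on squared error, using only the new example."""
    y = forward(weights, x)
    err = [yi - ti for yi, ti in zip(y, target)]
    new_W = [[w - lr * 2 * e * xi for w, xi in zip(row, x)]
             for row, e in zip(weights["W"], err)]
    new_b = [b - lr * 2 * e for b, e in zip(weights["b"], err)]
    return {"W": new_W, "b": new_b}


x, target = [1.0, 2.0], [0.0, 0.0]
before = squared_error(forward(weights, x), target)
tuned = fine_tune_step(weights, x, target)
after = squared_error(forward(tuned, x), target)
print(before, after)  # the error on the new example shrinks after the step
```

The fine-tuning step touches only the published numbers and the user's own example, which is why open weights allow adaptation and local deployment without the original training set ever being disclosed.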
