The unbearable cheapness of open weight models
Economics of Open-Weight vs Closed Models
- Many argue inference is already “dirt cheap” and getting cheaper; training and RLHF/RLAIF are the real cost drivers.
- Open-weight providers (e.g., DeepSeek, Xiaomi) are seen as lacking a brand or technical moat, so they must price at commodity levels.
- Others question sustainability of ultra-low prices, citing GPU depreciation and claims that some providers’ margins look unrealistic.
- There’s disagreement on whether frontier labs’ inference margins are ~50% (constrained by costs) or >90% (with cash burn mostly from training and org bloat).
- Subscriptions vs. API: per-token API pricing looks expensive; flat-rate plans are likely subsidized, though some think they remain profitable because of unused quota.
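The depreciation and margin arguments above come down to simple arithmetic, which can be sketched as follows. This is a rough illustration only: the function names and every number (GPU price, service life, power cost, throughput, utilization, token prices) are made-up assumptions, not figures from the thread or from any provider.

```python
# Illustrative only: all inputs below are hypothetical, not provider data.

def cost_per_million_tokens(gpu_price_usd, lifetime_years, power_usd_per_hour,
                            tokens_per_second, utilization):
    """Serving cost per 1M tokens: straight-line GPU depreciation plus power."""
    hours_in_service = lifetime_years * 365 * 24
    depreciation_per_hour = gpu_price_usd / hours_in_service
    hourly_cost = depreciation_per_hour + power_usd_per_hour
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


def gross_margin(price_per_million, cost_per_million):
    """Fraction of the selling price left after serving cost."""
    return (price_per_million - cost_per_million) / price_per_million


if __name__ == "__main__":
    cost = cost_per_million_tokens(
        gpu_price_usd=30_000, lifetime_years=3, power_usd_per_hour=0.50,
        tokens_per_second=2_000, utilization=0.5,
    )
    # Compare a commodity-style price with two higher price points.
    for price in (0.50, 2.00, 10.00):
        margin = gross_margin(price, cost)
        print(f"price ${price:.2f}/M, cost ${cost:.2f}/M, margin {margin:.0%}")
```

The point of the sketch is sensitivity rather than the specific outputs: shortening the assumed service life or lowering utilization raises cost per token quickly, which is why the same public price can look like a thin margin to one commenter and a fat one to another.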
Business Models and Moats for Frontier Labs
- Concern that open-weight models will commoditize “95% of tasks” (coding, research, cowork-style work), dragging down prices.
- Proposed survival paths:
  - Push frontier capabilities (AGI, advanced science, biotech, nuclear design) and take royalties.
  - Own the application layer (enterprise tools, vertical products) and sell “best in class, safe, compliant” to large orgs.
  - Regulatory capture: “certified/regulated models only,” federal procurement rules, copyright licenses as gatekeeping.
  - Temporarily buying up scarce hardware (RAM/GPUs) to delay competitors until IPO.
- Skeptics doubt the frontier labs’ app-design chops and note that incumbents in verticals (law, finance, healthcare) can pair domain expertise with open weights.
Capabilities, AGI, and Data Constraints
- Several see progress as gradual and “moat-less”: open models already nip at frontier labs, especially with good harnesses/agents.
- Debate on whether we are compute-limited or data-limited:
  - One side: high-quality data is the bottleneck; synthetic data helps only conditionally and can cause “model collapse” if overused.
  - Other side: synthetic data plus partner datasets (biotech, engineering, etc.) can extend scaling.
- Recursive self-improvement / “singularity” is viewed as overhyped; on this view, scaling compute remains the dominant driver of progress.
Practical Use of Open-Weight and Local Models
- Multiple commenters report doing substantial work (coding agents, translation, niche CVE discovery, custom scripting) cheaply with local 12–35B models plus caching.
- KV/prompt caching and agent loops can push effective token cost near zero for many workloads.
- For many business tasks, human-level or even sub-human intelligence suffices; cost, control, and predictability matter more than absolute quality.
- Some still find frontier coding models clearly superior for complex tasks.
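The caching claim above can be made concrete with a toy cost model of an agent loop. In this sketch each step resends the whole growing context; with prompt caching, the prefix already seen on the previous step is billed at a discounted rate. The function name, the prices, and the billing rule are all hypothetical assumptions for illustration, and output tokens are ignored.

```python
# Toy model: hypothetical prices and billing rule, input tokens only.

def agent_loop_cost(system_tokens, tokens_per_step, steps,
                    price_per_million, cached_price_per_million):
    """Return (cost_without_cache, cost_with_cache) in the price's currency."""
    without_cache = 0.0
    with_cache = 0.0
    context = system_tokens
    for step in range(steps):
        # No caching: the full context is billed at the normal rate every step.
        without_cache += context * price_per_million / 1e6
        if step == 0:
            # First request: nothing is cached yet.
            with_cache += context * price_per_million / 1e6
        else:
            # Later requests: the old prefix is a cache hit, only the
            # tokens appended since the last step are billed at full price.
            prefix = context - tokens_per_step
            with_cache += (prefix * cached_price_per_million
                           + tokens_per_step * price_per_million) / 1e6
        context += tokens_per_step
    return without_cache, with_cache


if __name__ == "__main__":
    plain, cached = agent_loop_cost(
        system_tokens=8_000, tokens_per_step=500, steps=40,
        price_per_million=0.30, cached_price_per_million=0.03,
    )
    print(f"without cache: {plain:.4f}, with cache: {cached:.4f}")
```

Without caching the billed tokens grow roughly quadratically with the number of steps, while with caching they grow roughly linearly, so long agent runs benefit most. On a local model the "cached price" is closer to saved compute time than to a billed rate, so the numbers here are only a stand-in.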
Trust, Control, and Regulation
- Strong preference from some for open weights due to: data control, auditability, avoiding arbitrary bans, and compliance predictability.
- Others note local hosting can be “cheaper until you factor in security and liability,” which may push regulated industries toward big closed providers.
- Expectation from several that governments (especially the US) will ban or constrain foreign/open models, tying favored domestic labs to state interests.
Space and Exotic Infrastructure (Data Centers in Orbit)
- Long subthread on space-based data centers:
  - Optimists note SpaceX’s track record (reusable boosters, Starlink, on-orbit power and AI chips) and see orbital compute as engineering-feasible.
  - Skeptics stress economics and scaling: radiation, cooling, maintenance, launch costs, and the need for massive mass-to-orbit make large-scale deployment likely uneconomic for decades, if ever.
- General consensus: technically possible in niche form; commercial viability at scale is unclear.
Open Models, Market Structure, and Europe
- Some argue open weights themselves act as a moat: any new commercial model must beat strong free/cheap baselines, making new entrants (e.g., from Europe) harder to justify.
- Others point to existing European efforts and long-run benefits of open infrastructure.
Conceptual Clarifications: “Open Weights”
- “Open weights” = you can download the full parameter tensors and associated files (tokenizer, architecture metadata, configs) and run inference yourself.
- Fully “open models” also publish training code and datasets; these are less common.
- Weights are just large numeric matrices governing the model’s behavior; open weights enable fine-tuning and local deployment without revealing original training data.
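To illustrate the "weights are just numeric matrices" point, here is a deliberately tiny sketch: a single linear layer whose parameters live in a plain dictionary, a forward pass that uses them, and one gradient step that adjusts them on new data. The names `checkpoint`, `forward`, and `sgd_step` are hypothetical, and real open-weight releases ship far larger tensors in dedicated file formats; this only shows the principle that having the numbers is enough to run and fine-tune, with no access to the original training data.

```python
# Toy stand-in for a downloaded checkpoint: just named arrays of numbers.
checkpoint = {
    "W": [[0.5, -1.0], [2.0, 0.25]],  # 2x2 weight matrix
    "b": [0.1, -0.2],                 # bias vector
}


def forward(weights, x):
    """Inference: y = W x + b, computed with plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights["W"], weights["b"])
    ]


def squared_error(weights, x, target):
    """Sum of squared differences between prediction and target."""
    return sum((y - t) ** 2 for y, t in zip(forward(weights, x), target))


def sgd_step(weights, x, target, lr):
    """One gradient-descent step on squared error; returns new weights."""
    err = [y - t for y, t in zip(forward(weights, x), target)]
    new_w = [
        [w - lr * 2 * e * xj for w, xj in zip(row, x)]
        for row, e in zip(weights["W"], err)
    ]
    new_b = [b - lr * 2 * e for b, e in zip(weights["b"], err)]
    return {"W": new_w, "b": new_b}


if __name__ == "__main__":
    x, target = [1.0, 2.0], [0.0, 1.0]
    before = squared_error(checkpoint, x, target)
    tuned = sgd_step(checkpoint, x, target, lr=0.05)
    after = squared_error(tuned, x, target)
    print(f"loss before: {before:.4f}, after one step: {after:.4f}")
```

Fine-tuning here needs only the current weights and the new example, which mirrors why open weights permit local adaptation while saying nothing about what the model was originally trained on.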