The unbearable cheapness of open weight models

Economics of Open-Weight vs Closed Models

  • Many argue inference is already “dirt cheap” and getting cheaper; training and RLHF/RLAIF are the real cost drivers.
  • Open-weight providers (e.g., DeepSeek, Xiaomi) are seen as lacking brand/technical moat, so they must price at commodity levels.
  • Others question sustainability of ultra-low prices, citing GPU depreciation and claims that some providers’ margins look unrealistic.
  • There’s disagreement on whether frontier labs’ inference margins are ~50% (constrained by costs) or >90% (with cash burn mostly from training and org bloat).
  • Subscriptions vs API: API tokens look expensive; flat-rate plans are likely subsidized, but some think they remain profitable because many subscribers leave part of their quota unused.
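The margin and subscription arguments above come down to simple arithmetic, sketched below. Every number is a hypothetical placeholder, not a figure reported in the thread, and the function names are illustrative only. The point is how strongly the conclusion depends on assumed GPU cost, throughput, and quota utilization.

```python
def serving_cost_per_mtok(gpu_hourly_cost: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens on one GPU at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_cost / tokens_per_hour * 1_000_000


def gross_margin(price_per_mtok: float, cost_per_mtok: float) -> float:
    """Fraction of revenue left after paying the serving cost."""
    return (price_per_mtok - cost_per_mtok) / price_per_mtok


def subscription_profit(monthly_fee: float, quota_mtok: float,
                        utilization: float, cost_per_mtok: float) -> float:
    """Monthly profit on a flat-rate plan when only part of the quota is used."""
    return monthly_fee - quota_mtok * utilization * cost_per_mtok


# Hypothetical inputs: a GPU costing $2/hour (depreciation, power, hosting)
# that sustains 1000 tokens/second across all batched requests.
cost = serving_cost_per_mtok(gpu_hourly_cost=2.0, tokens_per_second=1000.0)

# The same serving cost yields very different margins at different prices.
premium_margin = gross_margin(price_per_mtok=5.0, cost_per_mtok=cost)
commodity_margin = gross_margin(price_per_mtok=1.0, cost_per_mtok=cost)

# A $20 plan with a 100M-token quota: profitable at 20% use, not at 100%.
light_user = subscription_profit(20.0, 100.0, utilization=0.2, cost_per_mtok=0.5)
heavy_user = subscription_profit(20.0, 100.0, utilization=1.0, cost_per_mtok=0.5)
```

Under these assumptions the ~50% and >90% margin camps differ mainly in which price and throughput they plug in, and a flat-rate plan flips from profit to loss as utilization rises.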

Business Models and Moats for Frontier Labs

  • Concern that open-weight models will commoditize “95% of tasks” (coding, research, cowork-style work), dragging down prices.
  • Proposed survival paths:
    • Push frontier capabilities (AGI, advanced science, biotech, nuclear design) and take royalties.
    • Own the application layer (enterprise tools, vertical products) and sell “best in class, safe, compliant” to large orgs.
    • Regulatory capture: “certified/regulated models only,” federal procurement rules, copyright licenses as gatekeeping.
    • Temporarily buying up scarce hardware (RAM/GPUs) to delay competitors until IPO.
  • Skeptics doubt frontier labs’ app-design chops and note that incumbents in verticals (law, finance, healthcare) can pair domain expertise with open weights.

Capabilities, AGI, and Data Constraints

  • Several see progress as gradual and “moat-less”: open models already nip at frontier labs, especially with good harnesses/agents.
  • Debate on whether we are compute-limited or data-limited:
    • One side: high-quality data is the bottleneck; synthetic data helps only conditionally and can cause “model collapse” if overused.
    • Other side: synthetic data plus partner datasets (biotech, engineering, etc.) can extend scaling.
  • Recursive self-improvement / “singularity” is viewed as overhyped; progress is seen as driven mainly by scaling compute.

Practical Use of Open-Weight and Local Models

  • Multiple commenters report doing substantial work (coding agents, translation, niche CVE discovery, custom scripting) cheaply with local 12–35B models plus caching.
  • KV/prompt caching and agent loops can push effective token cost near zero for many workloads.
  • For many business tasks, human-level or even sub-human intelligence suffices; cost, control, and predictability matter more than absolute quality.
  • Some still find frontier coding models clearly superior for complex tasks.
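The caching claim above can be made concrete with a rough model of an agent loop. It is a sketch under stated assumptions: each turn resends the whole conversation, a prefix cache covers everything sent on earlier turns, and cached tokens are billed at a discount. The token counts, prices, and function names are hypothetical.

```python
def agent_loop_tokens(system_tokens: int, tokens_per_turn: int, turns: int) -> tuple[int, int]:
    """Return (fresh, cached) input tokens over a multi-turn agent loop.

    Turn t sends the system prompt plus t turns of content. With prefix
    caching, only the newest turn (and the system prompt on turn 1) is fresh.
    """
    fresh = system_tokens + turns * tokens_per_turn
    cached = sum(system_tokens + (t - 1) * tokens_per_turn for t in range(2, turns + 1))
    return fresh, cached


def effective_price_per_mtok(fresh: int, cached: int,
                             input_price: float, cached_price: float) -> float:
    """Blended input price per million tokens given the fresh/cached split."""
    total = fresh + cached
    return (fresh * input_price + cached * cached_price) / total


# Hypothetical workload: 2000-token system prompt, 500 new tokens per turn, 10 turns.
fresh, cached = agent_loop_tokens(system_tokens=2000, tokens_per_turn=500, turns=10)
hit_rate = cached / (fresh + cached)

# Hypothetical pricing: $1.00 per million fresh tokens, $0.10 per million cached.
blended = effective_price_per_mtok(fresh, cached, input_price=1.0, cached_price=0.1)

# On local hardware the marginal cost of a cached token is close to zero.
local_blended = effective_price_per_mtok(fresh, cached, input_price=1.0, cached_price=0.0)
```

Longer loops raise the cache hit rate, so the blended price moves toward the cached price as the number of turns grows. This model covers input tokens only; output tokens are not cached.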

Trust, Control, and Regulation

  • Strong preference from some for open weights due to: data control, auditability, avoiding arbitrary bans, and compliance predictability.
  • Others note local hosting can be “cheaper until you factor in security and liability,” which may push regulated industries toward big closed providers.
  • Expectation from several that governments (especially the US) will ban or constrain foreign/open models, tying favored domestic labs to state interests.

Space and Exotic Infrastructure (Data Centers in Orbit)

  • Long subthread on space-based data centers:
    • Optimists note SpaceX’s track record (reusable boosters, Starlink, on-orbit power and AI chips) and see orbital compute as engineering-feasible.
    • Skeptics stress economics and scaling: radiation, cooling, maintenance, launch costs, and the sheer mass that must be lifted to orbit make large-scale deployment likely uneconomic for decades, if ever.
    • General consensus: technically possible in niche form; commercial viability at scale is unclear.

Open Models, Market Structure, and Europe

  • Some argue open weights themselves act as a moat: any new commercial model must beat strong free/cheap baselines, making new entrants (e.g., from Europe) harder to justify.
  • Others point to existing European efforts and long-run benefits of open infrastructure.

Conceptual Clarifications: “Open Weights”

  • “Open weights” = you can download the full parameter tensors and associated files (tokenizer, architecture metadata, configs) and run inference yourself.
  • Fully “open models” also publish training code and datasets; these are less common.
  • Weights are just large numeric matrices governing the model’s behavior; open weights enable fine-tuning and local deployment without revealing original training data.
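As a toy illustration of the points above (not how any real model is packaged), the sketch below treats a "checkpoint" as a config plus one weight matrix, runs inference from it, and applies one fine-tuning step using only new data. The JSON layout and function names are invented for this example. Real releases ship far larger tensors in dedicated binary formats.

```python
import json


def forward(weights: list[list[float]], x: list[float]) -> list[float]:
    """Inference for a single linear layer: a matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def fine_tune_step(weights, x, target, lr=0.1):
    """One gradient-descent step on squared error, using only the new example."""
    pred = forward(weights, x)
    return [
        [w - lr * 2 * (p - t) * xi for w, xi in zip(row, x)]
        for row, p, t in zip(weights, pred, target)
    ]


def squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# A hypothetical "published" checkpoint: metadata plus the parameter matrix.
# Nothing in it reveals the data the weights were originally trained on.
published = json.dumps({
    "config": {"inputs": 2, "outputs": 2},
    "weights": [[1.0, 0.0], [0.0, 1.0]],
})

checkpoint = json.loads(published)           # "download" and load the weights
weights = checkpoint["weights"]

x, target = [1.0, 2.0], [2.0, 2.0]           # new data supplied by the user
before = squared_error(forward(weights, x), target)

tuned = fine_tune_step(weights, x, target)   # local fine-tuning
after = squared_error(forward(tuned, x), target)
```

The fine-tuned matrix differs from the published one, yet producing it required only the weights and the user's own example.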