The beginning of scarcity in AI
Hardware and energy bottlenecks
- Many argue the bottleneck is manufacturing capacity, especially EUV lithography tools and the complex fab supply chain; expanding that capacity is slow and risky given the industry's past boom–bust cycles.
- Others point to power limits: turbine manufacturing and grid constraints make scaling datacenters difficult.
- There is debate over whether ASML-like tooling is the global bottleneck, versus the difficulty and cost of building full fabs and supporting infrastructure.
- Some note energy constraints are asymmetric: the US is seen as grid-limited; China as compute-limited but rapidly scaling wind/solar.
Is compute scarcity real or artificial?
- One camp sees genuine, multi‑year compute scarcity with GPU prices rising and high utilization.
- Another sees “artificial scarcity,” driven by hype, subsidized pricing, and investors chasing a bubble that may end in oversupply and cheap compute.
- There is disagreement over whether current AI demand is durable or more like past tech bubbles and crypto GPU spikes.
Model architectures, ASICs, and efficiency
- Transformer self-attention's O(n²) cost in sequence length is seen as a fundamental limit; some expect new architectures (e.g., state-space hybrids) to reduce compute needs.
- Skepticism that ASIC inference will dominate soon: by the time an ASIC ships, models may be several generations ahead. ASICs likely make sense only once architectures stabilize.
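The quadratic-scaling point can be made concrete with a back-of-the-envelope FLOP count. The sketch below uses the standard rough approximation that attention costs about 4·n²·d FLOPs (QKᵀ scores plus the weighted sum over V); the specific sequence lengths and model dimension are illustrative, not taken from the discussion.

```python
def attention_flops(n: int, d: int) -> int:
    """Rough FLOPs for one self-attention pass at sequence length n, model dim d.
    QK^T scores cost ~2*n*n*d; the weighted sum over V costs another ~2*n*n*d."""
    return 4 * n * n * d

base = attention_flops(4_096, 4_096)      # 4k context
longer = attention_flops(32_768, 4_096)   # 32k context, same model dim
# 8x the context length costs 64x the attention compute.
print(longer // base)  # 64
```

This ignores the MLP blocks (which scale linearly in n), so it overstates the effect at short contexts, but it captures why long-context workloads motivate sub-quadratic architectures.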
Local and open models
- Many stress that open-weight models lag frontier systems by ~6–12 months but are already “good enough” for many business tasks.
- Local inference is seen as a way to bypass cloud compute scarcity and future price hikes, at the cost of weaker models and hardware constraints.
- Others counter that local models still lack nuance and remain closer to older frontier performance (e.g., GPT‑3.5).
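The hardware-constraint trade-off above is easy to quantify for model weights alone. The sketch below is a rough estimate assuming weights dominate memory (it ignores KV cache and activations); the 70B parameter count and quantization levels are illustrative assumptions.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory (decimal GB) for model weights only.
    fp16 uses 2 bytes/param; 4-bit quantization uses ~0.5 bytes/param."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# A hypothetical 70B-parameter model:
print(weight_memory_gb(70, 2.0))  # fp16:  140.0 GB
print(weight_memory_gb(70, 0.5))  # 4-bit:  35.0 GB
```

This is why quantization is central to local inference: it is the difference between needing a multi-GPU server and fitting on a single high-memory consumer card or workstation.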
Economics, pricing, and dependency risk
- Discussion of labs “burning dollars to buy oranges at $1 and sell at $0.50” to gain market share, with hopes that compute prices or margins improve later.
- Strong concern about depending on proprietary LLM APIs: AI-first products may face rising COGS and forced price hikes if token prices increase.
- Some predict a dot‑com–style cycle: massive overbuild of AI infra, followed by bankruptcies and cheap surplus compute; others think high margins and demand might persist.
- Valuations of frontier labs are widely viewed as stretched; profitability and true margins are seen as opaque and possibly overstated.
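The API-dependency risk can be illustrated with a toy unit-economics model for a product reselling LLM tokens under a flat subscription. All prices and usage numbers below are hypothetical, chosen only to show how sensitive gross margin is to the provider's token price.

```python
def gross_margin(sub_price: float, tokens_per_user: float, price_per_mtok: float) -> float:
    """Gross margin per user per month: (revenue - token COGS) / revenue."""
    cogs = tokens_per_user / 1e6 * price_per_mtok
    return (sub_price - cogs) / sub_price

# Hypothetical: $20/month subscription, 5M tokens consumed per user per month.
print(gross_margin(20, 5e6, 2.0))  # $2/Mtok -> 0.5 (50% gross margin)
print(gross_margin(20, 5e6, 4.0))  # token price doubles -> 0.0 (margin wiped out)
```

A product whose COGS is mostly someone else's tokens has no buffer against upstream repricing, which is the forced-price-hike scenario the discussion raises.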
Innovation under constraints
- Scarcity is expected to drive:
  - Better "harnesses" (wrappers, tools, and orchestration layers around models).
  - Smaller, specialized models tailored to specific tasks and hardware constraints.
- Examples from China and constrained teams show that limited GPUs have already led to influential efficiency techniques.
- Some argue the real bottleneck isn’t compute but robust evaluation: without good measurement, cheaper or better models just let you make mistakes faster.
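The evaluation point implies that even a tiny fixed test set beats none when deciding whether a cheaper model is safe to swap in. A minimal sketch, where the "model" is a plain function standing in for an API call and the cases are hypothetical:

```python
# Minimal eval harness: score candidate models against a fixed labeled set
# before switching to a cheaper one.
cases = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]

def accuracy(model, cases):
    """Fraction of cases where the model's answer matches the label exactly."""
    return sum(model(q) == a for q, a in cases) / len(cases)

def cheap_model(q):
    # Hypothetical cheaper model that misses one case.
    return {"2+2": "4", "3*3": "9"}.get(q, "")

print(round(accuracy(cheap_model, cases), 2))  # 0.67
```

Without a harness like this, a cheaper or faster model just lets you make the same mistakes faster, which is exactly the commenters' point.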
Broader outlook and skepticism
- Several commenters doubt AI will deliver the transformative productivity needed to justify current spend.
- Others expect that as mid‑tier models rapidly improve, many use cases will move off frontier APIs to cheaper local or open alternatives.
- Unclear how long the current “scarcity era” will last; many expect a familiar boom‑bust pattern, but timing and magnitude remain uncertain.