Google says AI weather model masters 15-day forecast
Scope and novelty of the model
- Commenters say this is a new DeepMind model (GenCast), significantly better than prior AI weather models.
- It’s trained on ~40 years of ERA5 reanalysis data (1979–2018) and evaluated mainly on 2019.
- Main selling points: 15‑day global forecasts, claimed skill exceeding ECMWF's operational ensemble, and far lower runtime (minutes rather than hours on HPC hardware).
Training, backtesting, and overfitting concerns
- Evaluation is largely on historical “held-out” data (2019), not on live, forward-in-time forecasts yet.
- Multiple commenters worry that hyperparameter tuning and repeated model iteration against that same test period may have quietly overfit to it, inflating apparent generalization.
- Others respond that backtesting on post‑training years is standard practice; 2019 is after the training window, so it is at least a genuine out-of-sample period in calendar time.
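The "out-of-sample in calendar time" point above can be sketched as a strictly temporal split: train on one era, evaluate only on later years. The dates match the summary (ERA5-style data, 1979–2018 train, 2019 test), but the arrays here are toy stand-ins, not GenCast's actual pipeline.

```python
# Sketch of a strictly temporal holdout. Toy data only; the year ranges
# mirror those described above (train 1979-2018, test 2019).
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1979, 2020)                        # 1979-2019 inclusive
data = {y: rng.normal(size=(4, 4)) for y in years}   # toy "reanalysis" fields

train_years = years[years <= 2018]                   # fit on 1979-2018 (40 years)
test_years = years[years > 2018]                     # evaluate on 2019 only

# The key property: no calendar-time leakage between fit and evaluation.
assert set(train_years).isdisjoint(set(test_years))

X_train = np.stack([data[y] for y in train_years])
X_test = np.stack([data[y] for y in test_years])
print(X_train.shape, X_test.shape)                   # (40, 4, 4) (1, 4, 4)
```

Note that this guards against temporal leakage but not against the subtler worry in the thread: repeatedly tuning on the same held-out year erodes its value as a test set.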
Accuracy, extremes, and tail risks
- Several people care less about “97% of cases” and more about the 3% that are wrong: are they trivial drizzle misses or catastrophic-storm failures?
- Concern that AI models may do great on common, stable regimes but fail badly on rare, high-impact events (bomb cyclones, unusual hurricanes).
- Some note traditional models also struggle here; questions about whether AI actually improves extreme-event skill.
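The "which 3% is wrong?" question can be made concrete by stratifying forecast error by event severity rather than averaging over everything. This is a minimal sketch on synthetic data; the distributions, the error model, and the 97th-percentile cut are all illustrative assumptions, not results from GenCast.

```python
# Stratify forecast error by observed severity: are the misses on the
# common cases or on the extremes? Synthetic data for illustration only.
import numpy as np

rng = np.random.default_rng(1)
obs = rng.gamma(shape=2.0, scale=5.0, size=10_000)     # toy wind speeds
forecast = obs + rng.normal(0.0, 2.0 + 0.3 * obs)      # error grows with magnitude

threshold = np.quantile(obs, 0.97)                     # top 3% as "extremes"
extreme = obs >= threshold

mae_common = np.mean(np.abs(forecast[~extreme] - obs[~extreme]))
mae_extreme = np.mean(np.abs(forecast[extreme] - obs[extreme]))
print(f"MAE on common cases:  {mae_common:.2f}")
print(f"MAE on extreme cases: {mae_extreme:.2f}")
```

An aggregate skill score would average these together; reporting them separately is exactly what commenters are asking for when they distinguish drizzle misses from catastrophic-storm failures.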
AI vs physics-based / causal models
- One camp argues physics-based numerical models encode causal structure, are more interpretable, and handle distribution shifts better.
- Another notes that traditional models also have many heuristics and tuning; they’re not pure first-principles.
- A hybrid future is discussed: physics cores with ML components (e.g., neural differential equations, emulators of dynamical cores).
- Some evidence is cited that AI weather models reproduce classic dynamical behaviors, suggesting they’re more than naive pattern-matching.
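The hybrid idea mentioned above (physics cores with ML components) can be sketched as a known tendency term plus a small learned correction, stepped forward in time. Everything here is a toy: the "physics" is linear damping and the "network" is a hand-set matrix standing in for a trained model.

```python
# Toy hybrid step: explicit Euler over (physics tendency + ML correction).
# The physics term and the "learned" weights are illustrative stand-ins.
import numpy as np

def physics_tendency(x):
    # Stand-in dynamical core: linear damping toward zero.
    return -0.1 * x

W = np.array([[0.02, 0.0],
              [0.0, -0.01]])        # pretend-learned correction weights

def ml_correction(x):
    return W @ x

def hybrid_step(x, dt=0.1):
    return x + dt * (physics_tendency(x) + ml_correction(x))

state = np.array([1.0, 2.0])
for _ in range(100):
    state = hybrid_step(state)
print(state)  # decays toward zero: damping dominates the small correction
```

Neural-differential-equation approaches follow the same shape, but learn the correction term by differentiating through the solver rather than fixing it by hand.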
Understanding vs pure prediction
- Several worry AI forecasts improve utility but not scientific understanding; weights are opaque.
- Counterargument: operational forecasting is about usable predictions; understanding can still be pursued separately, and AI outputs can themselves be studied.
Operational, institutional, and trust issues
- GenCast depends on ECMWF-style reanalysis and initial conditions; its claimed compute savings are partly externalized to the systems that produce those inputs.
- DeepMind claims code/weights/forecasts will be released; some suspect eventual monetization or lock-in.
- Skepticism about Google’s claims is fueled by past missteps (Google Flu Trends, Gemini rollout), though others point to DeepMind’s strong track record (e.g., Alpha* work).
Climate change and distribution shift
- Debate over how much changing climate and evolving weather statistics will degrade AI model skill over time.
- Some think underlying atmospheric dynamics are stable enough that regular retraining will suffice; others think future, shifted regimes could cause sharp accuracy drops.
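One way to make the "regular retraining will suffice" position concrete is a drift monitor that compares live statistics against the training era and flags when they diverge. The `needs_retrain` helper, the z-threshold, and the toy temperatures below are all illustrative assumptions, not an operational scheme from the thread.

```python
# Crude distribution-shift monitor: two-sample z-test on the mean of a
# monitored variable. Toy numbers; thresholds are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)
baseline = rng.normal(15.0, 1.0, size=365)   # toy daily temperatures, training era
shifted = rng.normal(16.5, 1.0, size=365)    # toy warmer future year

def needs_retrain(train_sample, live_sample, z_thresh=3.0):
    # Compare means in units of the pooled standard error of the difference.
    se = train_sample.std(ddof=1) * np.sqrt(1 / len(train_sample) + 1 / len(live_sample))
    z = abs(live_sample.mean() - train_sample.mean()) / se
    return z > z_thresh

print(needs_retrain(baseline, baseline))  # False: identical statistics
print(needs_retrain(baseline, shifted))   # True: the regime has shifted
```

A monitor like this catches shifts in marginal statistics; the sharper worry in the thread is regimes whose *dynamics* differ from anything in the training record, which no first-moment test will detect.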