Project Aardvark: reimagining AI weather prediction

Paper versions, code, and data

  • Commenters note substantial differences between the arXiv and Nature versions; this is framed as normal revision through peer review and journal production.
  • Code/data are said to be released via a Zenodo archive (~13 GB). Some confusion around an earlier GitHub link vs the final Zenodo bundle, but no one reports having fully inspected it yet.

Scope, climate change, and robustness

  • Debate centers on whether models trained on historical data will degrade as climate shifts.
  • One side argues: climate change is slow relative to 2–14 day weather forecasts and doesn’t alter the physics, just initial conditions; models are continually updated.
  • Critics counter: large-scale circulation changes and increased variability/extremes can invalidate tuned assumptions and parameterizations faster than models are revised, potentially underestimating variance even when mean forecasts remain accurate.
  • Several agree that robustness under climate change is a major open question for ML-based forecasting.
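The degradation worry above is testable in principle: score the model year by year on held-out data and look for a trend, rather than reporting one aggregate skill number over the whole record. A minimal sketch (entirely synthetic data; in practice `residuals` would be forecast-minus-observation errors grouped by year):

```python
# Sketch: detecting forecast-skill drift under a shifting climate.
# The residuals below are synthetic, with error variance that grows
# slowly over time, mimicking a model whose tuned assumptions degrade.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2025)
residuals = {y: rng.normal(0.0, 1.0 + 0.02 * (y - 2000), size=365)
             for y in years}

# Per-year RMSE of the forecast errors.
rmse = np.array([np.sqrt(np.mean(residuals[y] ** 2)) for y in years])

# A positive slope in RMSE vs. year is evidence of degradation that a
# single skill score averaged over the whole record would hide.
slope, intercept = np.polyfit(years, rmse, 1)
print(f"RMSE trend: {slope:+.4f} per year")
```

The same per-year breakdown applied to error variance (not just the mean) would address the critics' specific point about underestimated spread.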

Accuracy, resolution, and practical expectations

  • A challenge is posed: accurate (±3°F) forecasts for Kansas City more than two days out; several users report large swings until day-of.
  • Others respond that:
    • Instrument accuracy and spatial variability make ±3°F citywide unrealistic.
    • Backyard-scale frost prediction is particularly hard; microclimates can differ by >10°F even within a city.
  • Aardvark’s 1.5° grid is noted as coarse compared to operational models (0.25–0.5°), but still adequate for large-scale/synoptic patterns; local forecasts typically rely on statistical downscaling.
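To make the coarseness concrete, the physical extent of a grid cell follows from geometry alone (no model specifics assumed); a quick sketch comparing a 1.5° cell to a 0.25° operational cell at roughly Kansas City's latitude:

```python
# Sketch: physical size of a latitude/longitude grid cell.
import math

EARTH_RADIUS_KM = 6371.0

def cell_size_km(deg: float, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Return (north-south, east-west) extent in km of a `deg`-degree
    cell centered at `latitude_deg`. East-west shrinks with cos(latitude)."""
    ns = math.radians(deg) * EARTH_RADIUS_KM
    ew = ns * math.cos(math.radians(latitude_deg))
    return ns, ew

for deg in (1.5, 0.25):
    ns, ew = cell_size_km(deg, latitude_deg=39.1)  # ~Kansas City
    print(f"{deg:>4}\u00b0 grid: ~{ns:.0f} km N-S x ~{ew:.0f} km E-W")
```

A 1.5° cell spans well over 100 km on a side, so a single grid value averages over many microclimates, which is why local forecasts layer statistical downscaling on top.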

Model design and comparison to other AI weather models

  • Distinction is drawn between:
    • Conventional AI weather models (e.g., GraphCast/GenCast, MetNet), which take a gridded “analysis state” such as ERA5 as input.
    • Aardvark, which is described as “end-to-end,” aiming to ingest raw observations directly and bypass traditional data-assimilation steps.
  • Some details remain unclear in the thread about the exact training data mix (ERA5 vs observations).

Data infrastructure and evaluation

  • Multiple global and regional archives are listed (NOAA, ECMWF, EUMETSAT, Copernicus/ERA5, national services); there is no single universal clearinghouse.
  • Concepts of “hindcasting”/backtesting are discussed for validating new models on historical data.
  • Concerns are raised about potential political cuts to observational networks (e.g., balloons, NOAA), which would directly degrade both traditional and AI forecasts.
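The hindcasting idea above has a simple shape: replay the historical archive, issue a forecast from each start date using only data available at that point, and score it against what actually happened. A minimal sketch with a synthetic series and a placeholder persistence baseline (a real evaluation would swap in the model under test):

```python
# Sketch of a hindcast ("backtest") loop over historical data.
import numpy as np

rng = np.random.default_rng(1)
obs = np.cumsum(rng.normal(0.0, 1.0, 400))  # synthetic daily series

LEAD = 3  # forecast lead time in days

def forecast(history: np.ndarray, lead: int) -> float:
    # Persistence baseline: predict the last observed value.
    # A real hindcast would call the model being evaluated here.
    return float(history[-1])

errors = []
for t in range(100, len(obs) - LEAD):
    pred = forecast(obs[:t], LEAD)          # only past data visible
    errors.append(pred - obs[t + LEAD - 1])  # verify against the future

rmse = float(np.sqrt(np.mean(np.square(errors))))
print(f"{LEAD}-day persistence RMSE over hindcast period: {rmse:.2f}")
```

The key discipline is the `obs[:t]` slice: the forecast at each start date must never see data from after that date, or the backtest leaks the future and overstates skill.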

Compute, decentralization, and naming

  • Running state-of-the-art forecasts on a desktop is seen as a large reduction in required compute, enabling wider access, and possibly stronger models if the same approach is scaled back up on supercomputers.
  • Local/desktop models could also improve privacy, since no location needs to be sent to a remote service.
  • “Aardvark” is praised as a name that alphabetically tops model lists.