Project Aardvark: reimagining AI weather prediction

Paper versions, code, and data

  • Commenters note substantial differences between the arXiv and Nature versions; this is framed as normal revision through peer review and journal production.
  • Code/data are said to be released via a Zenodo archive (~13 GB). Some confusion around an earlier GitHub link vs the final Zenodo bundle, but no one reports having fully inspected it yet.

Scope, climate change, and robustness

  • Debate centers on whether models trained on historical data will degrade as climate shifts.
  • One side argues: climate change is slow relative to 2–14 day weather forecasts and doesn’t alter the physics, just initial conditions; models are continually updated.
  • Critics counter: large-scale circulation changes and increased variability/extremes can invalidate tuned assumptions and parameterizations faster than models are revised, potentially underestimating variance even when mean forecasts remain accurate.
  • Several agree that robustness under climate change is a major open question for ML-based forecasting.
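The degradation worry above is testable in principle: score the model year by year on held-out data and look for a trend, rather than reporting one aggregate skill number over the whole record. A minimal sketch (entirely synthetic data; in practice `residuals` would be forecast-minus-observation errors grouped by year):

```python
# Sketch: detecting forecast-skill drift under a shifting climate.
# The residuals below are synthetic, with error variance that grows
# slowly over time, mimicking a model whose tuned assumptions degrade.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2025)
residuals = {y: rng.normal(0.0, 1.0 + 0.02 * (y - 2000), size=365)
             for y in years}

# Per-year RMSE of the forecast errors.
rmse = np.array([np.sqrt(np.mean(residuals[y] ** 2)) for y in years])

# A positive slope in RMSE vs. year is evidence of degradation that a
# single skill score averaged over the whole record would hide.
slope, intercept = np.polyfit(years, rmse, 1)
print(f"RMSE trend: {slope:+.4f} per year")
```

The same per-year breakdown applied to error variance (not just the mean) would address the critics' specific point about underestimated spread.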

Accuracy, resolution, and practical expectations

  • A challenge is posed: accurate (±3°F) forecasts for Kansas City more than two days out; several users report large swings until day-of.
  • Others respond that:
    • Instrument accuracy and spatial variability make ±3°F citywide unrealistic.
    • Backyard-scale frost prediction is particularly hard; microclimates can differ by >10°F even within a city.
  • Aardvark’s 1.5° grid is noted as coarse compared to operational models (0.25–0.5°), but still adequate for large-scale/synoptic patterns; local forecasts typically rely on statistical downscaling.
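To make the coarseness concrete, the physical extent of a grid cell follows from geometry alone (no model specifics assumed); a quick sketch comparing a 1.5° cell to a 0.25° operational cell at roughly Kansas City's latitude:

```python
# Sketch: physical size of a latitude/longitude grid cell.
import math

EARTH_RADIUS_KM = 6371.0

def cell_size_km(deg: float, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Return (north-south, east-west) extent in km of a `deg`-degree
    cell centered at `latitude_deg`. East-west shrinks with cos(latitude)."""
    ns = math.radians(deg) * EARTH_RADIUS_KM
    ew = ns * math.cos(math.radians(latitude_deg))
    return ns, ew

for deg in (1.5, 0.25):
    ns, ew = cell_size_km(deg, latitude_deg=39.1)  # ~Kansas City
    print(f"{deg:>4}\u00b0 grid: ~{ns:.0f} km N-S x ~{ew:.0f} km E-W")
```

A 1.5° cell spans well over 100 km on a side, so a single grid value averages over many microclimates, which is why local forecasts layer statistical downscaling on top.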

Model design and comparison to other AI weather models

  • Distinction is drawn between:
    • Conventional AI weather models (e.g., GraphCast/GenCast, MetNet), which take a gridded “analysis state” such as ERA5 as input.
    • Aardvark, which is described as “end-to-end,” aiming to ingest raw observations directly and bypass traditional data-assimilation steps.
  • Some details remain unclear in the thread about the exact training data mix (ERA5 vs observations).

Data infrastructure and evaluation

  • Multiple global and regional archives are listed (NOAA, ECMWF, EUMETSAT, Copernicus/ERA5, national services); there is no single universal clearinghouse.
  • Concepts of “hindcasting”/backtesting are discussed for validating new models on historical data.
  • Concerns are raised about potential political cuts to observational networks (e.g., balloons, NOAA), which would directly degrade both traditional and AI forecasts.
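The hindcasting idea above has a simple shape: replay the historical archive, issue a forecast from each start date using only data available at that point, and score it against what actually happened. A minimal sketch with a synthetic series and a placeholder persistence baseline (a real evaluation would swap in the model under test):

```python
# Sketch of a hindcast ("backtest") loop over historical data.
import numpy as np

rng = np.random.default_rng(1)
obs = np.cumsum(rng.normal(0.0, 1.0, 400))  # synthetic daily series

LEAD = 3  # forecast lead time in days

def forecast(history: np.ndarray, lead: int) -> float:
    # Persistence baseline: predict the last observed value.
    # A real hindcast would call the model being evaluated here.
    return float(history[-1])

errors = []
for t in range(100, len(obs) - LEAD):
    pred = forecast(obs[:t], LEAD)          # only past data visible
    errors.append(pred - obs[t + LEAD - 1])  # verify against the future

rmse = float(np.sqrt(np.mean(np.square(errors))))
print(f"{LEAD}-day persistence RMSE over hindcast period: {rmse:.2f}")
```

The key discipline is the `obs[:t]` slice: the forecast at each start date must never see data from after that date, or the backtest leaks the future and overstates skill.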

Compute, decentralization, and naming

  • Running state-of-the-art forecasts on a desktop is seen as a large reduction in required compute, enabling wider access, and possibly stronger models if the same approach is scaled back up on supercomputers.
  • Local/desktop models could also improve privacy, since no location needs to be sent to a remote service.
  • “Aardvark” is praised as a name that alphabetically tops model lists.