xkcd 1425 (“Tasks”) turns ten years old today

Bird vs. park: which task is actually harder now?

  • Many note that the comic’s premise has reversed: bird detection is now “trivial” with off‑the‑shelf CNN/YOLO models, and a few lines of code calling a pretrained detector can find birds in images.
  • Others point out constraints the comic ignored: doing this on a 2014 phone, or without today’s models and tooling, would still have been hard.
  • The “is this in a national park?” task hides complexity: GPS inaccuracy, bad reception, boundary ambiguity (rivers, cities in parks), changing geography, and unclear requirements for “in” vs “near” a park.
  • GIS lookup itself (point‑in‑polygon against public park polygons) is technically straightforward now, but still depends on decades of prior work and datasets.
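The point‑in‑polygon step really is the easy part now. A minimal sketch of the standard ray‑casting test, using a crude hypothetical rectangle as a stand‑in for a real park boundary (real boundary polygons have thousands of vertices plus the edge cases the thread mentions):

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: a point is inside if a ray cast to the
    right crosses the polygon boundary an odd number of times."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude
            cross_lon = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < cross_lon:
                inside = not inside
    return inside

# Hypothetical rectangle roughly where Yellowstone sits; NOT a real boundary
park = [(-111.1, 44.1), (-109.8, 44.1), (-109.8, 45.1), (-111.1, 45.1)]

print(point_in_polygon(-110.5, 44.6, park))  # True
print(point_in_polygon(-100.0, 40.0, park))  # False
```

Everything hard about the task (GPS error, boundary data quality, “in” vs. “near”) lives outside this function; the geometry itself is a dozen lines.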

Progress in ML and shifting goalposts

  • Commenters reflect on how impressive GANs and early convnets looked a decade ago versus today’s standards.
  • Some argue expectations for AGI and the Turing test have been continuously “moved”; others say tests must naturally get stricter as systems improve.
  • There’s debate whether current LLMs count as anything close to AGI, with some insisting core capabilities (strong reasoning, robust math, one‑shot learning) are still missing.

Turing test, “real thinking,” and philosophy

  • Long sub‑thread on whether LLMs “really think” or just simulate language; references to Turing’s original framing, functionalism, and the Chinese Room.
  • Some stress that questions like “does it think?” are partly semantic; others see deep qualitative gaps between human and model reasoning, despite superficial fluency.

Jobs, productivity, and “good enough” automation

  • Disagreement on how many current jobs could be replaced today: some see many roles as trivially automatable; others say almost no jobs are fully replaceable yet.
  • Concern that AI will be used where it’s cheaper but worse (translation, customer service), leading to degraded quality but higher profits.
  • Others see AI enabling new jobs in places that previously couldn’t afford human labor at all.

Why similar‑looking software tasks differ by orders of magnitude

  • Many emphasize the comic’s core point: non‑technical stakeholders often can’t tell which tasks are “move a chair” versus “move the toilet and plumbing.”
  • House and plumbing analogies explain why tiny‑sounding requirements (e.g., “robust ML,” a “small UI tweak,” or a second optimistic‑update path) can imply major re‑architecture.

LLMs’ capabilities and sharp edges

  • LLMs excel at data cleanup, metadata normalization, simple CV (vision APIs), and rapid prototyping; they’re described as “ultimate interns.”
  • Failure modes are highlighted: hallucinations, weak arithmetic, reluctance to say “I don’t know,” and trouble with tokenization‑sensitive tasks (spelling “STRAWBERRY,” rendering “HELLO” in vegetables, negation like “no cheese”).
  • Some see these as inherent to current token‑based architectures; others think better training, tools, or hybrid systems can mitigate many issues.
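The tokenization point is easy to make concrete: the character‑level tasks that trip up token‑based models are one‑liners in ordinary code, which sees individual letters rather than multi‑character token chunks (a toy illustration, not a model benchmark):

```python
# Character-level operations are trivial for code, which operates on
# letters; token-based models instead see multi-character chunks whose
# boundaries vary by tokenizer.
word = "strawberry"
print(word.count("r"))         # 3
print(" ".join(word.upper()))  # S T R A W B E R R Y
```

The asymmetry is the comic’s point in miniature: what is effortless at one level of representation can be genuinely hard at another.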

Infrastructure, maturity, and “easy vs hard”

  • Multiple comments stress that both tasks (GPS/GIS and bird detection) became “easy” only after huge investments in satellites, GIS standards, datasets, GPUs, and ML research.
  • The comic is read less as a statement about vision per se and more as a reminder that perceived vs actual difficulty in software depends heavily on invisible infrastructure and historical context.