Is Winter Coming? (2024)

State of AI Progress and “Winter”

  • Some argue visible progress is slowing: newer models mostly improve speed and context length and reduce hallucination rates, rather than delivering qualitatively new abilities. They want breakthroughs like near‑zero hallucinations, far better data efficiency, and explicit epistemic uncertainty.
  • Others see recent reasoning models as a clear step‑change, especially on math and structured reasoning, not just “more of the same.”
  • Several note that progress tends to be stepwise, not smooth, and that the “last 10%” of reliability may be the hardest yet most transformative.
  • There’s disagreement over whether current LLMs can ever become “real intelligence,” but also a strong view that we don’t need AGI for huge practical value.

Self‑Driving Cars as a Case Study in Hype vs Reality

  • One camp cites autonomous robo‑taxis (e.g., in US cities) as proof the old “self‑driving is hype” narrative is outdated: door‑to‑door rides, in real traffic, at scale.
  • Critics stress limitations: heavy pre‑mapping, geofencing, dependence on specific cities and conditions; by a lay understanding, that isn’t “completely autonomous” or SAE Level 5.
  • Debate over “moving the goalposts”: skeptics say the original promise was cars that handle anything a human can, anywhere (e.g., Mumbai, Rome, cross‑country trips). Others say it’s normal to deploy gradually in easier domains.
  • This is used as an analogy for AI overall: impressive partial success vs broad, unconstrained competence.

Mental Models of LLMs and Agents

  • Multiple comments distinguish raw LLMs (statistical token predictors) from agents that wrap an LLM with tools like web search and retrieval; a minimal sketch of the distinction follows this list. Confusing the two leads users to overtrust plain LLM answers.
  • Some defend the “just prediction” description as still essential for safety intuition; others note that with huge parameter spaces and attention, “stringing words together” can yield surprisingly deep transformations.
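
To make the raw‑LLM‑vs‑agent distinction concrete, here is a minimal sketch. `llm()` and `web_search()` are hypothetical stand‑ins for a chat‑completion API and a search tool (neither is a real library call); the point is the structure, not the specifics: the agent is the same model, just wrapped in a loop that can consult tools before answering.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    raise NotImplementedError("plug in a real model here")

def web_search(query: str) -> str:
    """Hypothetical stand-in for a search tool returning result snippets."""
    raise NotImplementedError("plug in a real search API here")

def raw_answer(question: str) -> str:
    # Plain LLM: a single prediction pass over the prompt. No retrieval,
    # no sources -- the kind of answer the thread says users overtrust.
    return llm(question)

def agent_answer(question: str) -> str:
    # Agent: the same model, wrapped in a loop that lets it request a
    # tool call (here, web search) before committing to an answer.
    prompt = (
        "Answer the question below. If you need current facts, reply with\n"
        "exactly 'SEARCH: <query>'; otherwise reply 'ANSWER: <answer>'.\n\n"
        f"Question: {question}"
    )
    for _ in range(3):  # bound the tool-use loop
        reply = llm(prompt).strip()
        if reply.startswith("SEARCH:"):
            results = web_search(reply.removeprefix("SEARCH:").strip())
            prompt += f"\n\nSearch results:\n{results}\n\nNow answer."
        else:
            return reply.removeprefix("ANSWER:").strip()
    return "No answer after 3 tool calls."
```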

Prompting, Expertise, and Reliability

  • Several anecdotes show experts get better answers: using correct jargon seems to “route” the model toward higher‑quality training text, while lay phrasing elicits amateurish or outright wrong advice.
  • Domain knowledge also helps users spot errors and push back; non‑experts may accept flawed outputs, especially in finance, medicine, or math.
  • Techniques suggested: ask the model to restate your question in expert language, set explicit context about your background, or first use it to learn the domain vocabulary (see the sketch after this list).
  • Others warn such “tips” are not reliably generalizable; models can contradict themselves, confidently defend wrong answers, or change correct ones when challenged.
  • Comparisons to search engines: query skill has always mattered, but with web search you can inspect the sources yourself; with LLMs, source provenance is opaque and misrepresentation is hard to detect.
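
As a concrete (and, per the caveats above, not reliably generalizable) version of the “restate in expert language” tip, here is a minimal two‑step sketch. `llm()` is the same hypothetical chat‑completion stand‑in as in the earlier sketch, and the prompt wording is illustrative, not a tested recipe.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    raise NotImplementedError("plug in a real model here")

def expert_routed_answer(lay_question: str, background: str) -> str:
    # Step 1: have the model translate lay phrasing into the jargon a
    # domain expert would use -- the "routing" effect described above.
    restated = llm(
        "Rewrite this question the way a domain expert would phrase it, "
        "using precise technical terminology. Output only the question.\n\n"
        + lay_question
    )
    # Step 2: ask the expert-phrased question, with explicit context
    # about the asker so the answer lands at the right level.
    return llm(
        f"My background: {background}\n\n{restated}\n\n"
        "Answer precisely, and flag any points of genuine uncertainty."
    )

# Usage (hypothetical): expert_routed_answer(
#     "why won't my bread rise?", "home baker, no food-science background")
```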

AI Hype, Economics, and Future Winters

  • Some foresee another AI winter when monetization disappoints and the “race to zero” on margins bites; others argue current AI spending dwarfs past cycles, making a full winter unlikely even if many bets fail.
  • A different “winter” is described inside firms: layoffs and strategic paralysis while management waits for AGI to magically fix productivity, which may harm real economic output.
  • Several note that “AI” is a moving target: once a technique works and becomes mundane (chess, search, LLMs), it stops being called “AI,” so expectations and goalposts keep shifting with each wave.

Writing Style and Discourse

  • Some readers criticize long, discursive AI essays as overextended for relatively simple theses. Others—especially long‑form bloggers—say length is needed to preempt nitpicks, fully defend positions, and “write to think,” even if that clashes with readers’ desire for concise arguments.