OpenAI, Google and Anthropic are struggling to build more advanced AI

Perceived Plateau vs Continued Progress

  • Many see signs that transformer scaling is hitting diminishing returns: new frontier models beat their predecessors, but not by GPT‑3→GPT‑4‑style leaps, especially on coding and reasoning benchmarks.
  • Others argue this is just a normal plateau after a breakthrough, analogous to past tech cycles; progress is now slower and more engineering‑driven, not over.
  • Some point to recent models (e.g., o1, Claude 3.5, Gemini updates) as evidence that meaningful gains are still coming, though more incrementally.

Scaling Laws, Data and Synthetic Data

  • Several comments say data, not compute, is the bottleneck: high‑quality human text/code is finite; web data is being exhausted or walled off.
  • Concerns that training on synthetic data, or on a model’s own outputs, leads to “model collapse” and runs into information‑theoretic limits.
  • Others counter that there is still abundant untapped multimodal data (video, audio, in‑situ robot data) and better ways to use existing data.
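The “model collapse” worry above can be illustrated with a toy simulation (this is only a sketch of the intuition, not the setup used in the actual model-collapse papers): fit a Gaussian to a dataset, generate the next “training set” purely from that fit, and repeat. Because each maximum-likelihood fit slightly underestimates the spread of a finite sample, the distribution narrows generation after generation.

```python
import random
import statistics

random.seed(0)

def next_generation(samples, n):
    # Fit a Gaussian to the previous generation's outputs (MLE),
    # then "train the next model" by sampling only from that fit.
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # biased MLE estimate of spread
    return [random.gauss(mu, sigma) for _ in range(n)]

n = 50
data = [random.gauss(0.0, 1.0) for _ in range(n)]  # the "real" data
spread = [statistics.pstdev(data)]
for _ in range(200):
    data = next_generation(data, n)
    spread.append(statistics.pstdev(data))

print(f"spread: {spread[0]:.3f} -> {spread[-1]:.3f} after 200 generations")
```

In expectation the variance shrinks by a factor of (n−1)/n each generation, so the tails of the original distribution are progressively lost, which is the rough analogue of a model forgetting rare knowledge when trained on its own outputs.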

Definitions and Expectations of AGI

  • Strong disagreement on what AGI means:
    • Industry‑style: “systems that outperform humans on most economically valuable tasks.”
    • Pop‑culture: self‑aware, conscious, human‑like minds.
  • Some argue vendors intentionally blur definitions to hype progress and keep an escape hatch (“we never meant that AGI”).
  • Persistent philosophical disputes about self‑awareness, consciousness, and whether behavior alone is enough (Chinese Room, “hard problem”).

Agents, Embodiment and Memory

  • One camp thinks “agents” with tools, long‑running tasks, and robot bodies (embodied cognition) are the next big step and path to AGI.
  • Skeptics see multi‑agent systems and current robotics as overhyped, citing self‑driving and humanoid demos as cautionary tales.
  • Many highlight missing ingredients: persistent, editable long‑term memory; online learning; knowing what you don’t know; and meta‑reasoning about goals.

Capabilities, Failure Modes and Use Cases

  • Strong agreement that current LLMs are extremely useful for autocomplete‑like coding, drafting text, tutoring, support triage, and classification, provided they are kept in “low‑risk search” or human‑in‑the‑loop roles.
  • Hallucinations, lack of calibrated confidence, and brittle reasoning remain core blockers for mission‑critical or fully autonomous use.
  • Some argue we’ve barely explored what today’s models can do via better orchestration (RAG, tools, knowledge graphs, fine‑tunes); others report disappointing real‑world reliability and retention.
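The “lack of calibrated confidence” complaint above has a standard measurement: expected calibration error (ECE), which buckets predictions by stated confidence and compares average confidence to empirical accuracy in each bucket. A minimal sketch, with made-up `(confidence, correct)` pairs purely for illustration:

```python
def ece(preds, n_bins=5):
    """Expected calibration error.

    preds: list of (confidence in [0, 1], correct as bool) pairs.
    Returns the bin-weighted average of |avg confidence - accuracy|.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    total = len(preds)
    err = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        err += (len(b) / total) * abs(avg_conf - accuracy)
    return err

# An overconfident model: ~93% average confidence, 50% accuracy.
overconfident = [(0.95, True), (0.95, False), (0.9, False), (0.92, True)]
print(f"ECE: {ece(overconfident):.3f}")  # -> ECE: 0.430
```

A perfectly calibrated model scores 0; the gap is what blocks the “trust it when it says it’s sure” mode of autonomous use.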

Economics, Hype and Bubble Risk

  • Widespread suspicion that top labs and GPU vendors are over‑promising to justify massive capex; comparisons to dot‑com and crypto bubbles recur.
  • Some expect an “AI winter” or at least a sharp correction if scaling stalls before revenue catches up; others think even sub‑AGI productivity tools justify large businesses.