OpenAI, Google and Anthropic are struggling to build more advanced AI

Perceived Plateau vs Continued Progress

  • Many see signs that transformer scaling is hitting diminishing returns: new frontier models beat their predecessors, but not by GPT‑3→GPT‑4‑style leaps, especially on coding and reasoning benchmarks.
  • Others argue this is just a normal plateau after a breakthrough, analogous to past tech cycles; progress is now slower and more engineering‑driven, not over.
  • Some point to recent models (e.g., o1, Claude 3.5, Gemini updates) as evidence that meaningful gains are still coming, though more incrementally.

Scaling Laws, Data and Synthetic Data

  • Several comments say data, not compute, is the bottleneck: high‑quality human text/code is finite; web data is being exhausted or walled off.
  • Concerns that training on synthetic data, or on a model’s own outputs, leads to “model collapse” and runs into information‑theoretic limits.
  • Others counter that there is still abundant untapped multimodal data (video, audio, in‑situ robot data) and better ways to use existing data.
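The “model collapse” worry above can be illustrated with a toy simulation (this is only a sketch of the intuition, not the setup used in the actual model-collapse papers): fit a Gaussian to a dataset, generate the next “training set” purely from that fit, and repeat. Because each maximum-likelihood fit slightly underestimates the spread of a finite sample, the distribution narrows generation after generation.

```python
import random
import statistics

random.seed(0)

def next_generation(samples, n):
    # Fit a Gaussian to the previous generation's outputs (MLE),
    # then "train the next model" by sampling only from that fit.
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # biased MLE estimate of spread
    return [random.gauss(mu, sigma) for _ in range(n)]

n = 50
data = [random.gauss(0.0, 1.0) for _ in range(n)]  # the "real" data
spread = [statistics.pstdev(data)]
for _ in range(200):
    data = next_generation(data, n)
    spread.append(statistics.pstdev(data))

print(f"spread: {spread[0]:.3f} -> {spread[-1]:.3f} after 200 generations")
```

In expectation the variance shrinks by a factor of (n−1)/n each generation, so the tails of the original distribution are progressively lost, which is the rough analogue of a model forgetting rare knowledge when trained on its own outputs.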

Definitions and Expectations of AGI

  • Strong disagreement on what AGI means:
    • Industry‑style: “systems that outperform humans on most economically valuable tasks.”
    • Pop‑culture: self‑aware, conscious, human‑like minds.
  • Some argue vendors intentionally blur definitions to hype progress and keep an escape hatch (“we never meant that AGI”).
  • Persistent philosophical disputes about self‑awareness, consciousness, and whether behavior alone is enough (Chinese Room, “hard problem”).

Agents, Embodiment and Memory

  • One camp thinks “agents” with tools, long‑running tasks, and robot bodies (embodied cognition) are the next big step and path to AGI.
  • Skeptics see multi‑agent systems and current robotics as overhyped, citing self‑driving and humanoid demos as cautionary tales.
  • Many highlight missing ingredients: persistent, editable long‑term memory; online learning; knowing what you don’t know; and meta‑reasoning about goals.

Capabilities, Failure Modes and Use Cases

  • Strong agreement that current LLMs are extremely useful for autocomplete‑like coding, drafting text, tutoring, support triage, and classification, provided they are kept in “low‑risk search” or human‑in‑the‑loop roles.
  • Hallucinations, lack of calibrated confidence, and brittle reasoning remain core blockers for mission‑critical or fully autonomous use.
  • Some argue we’ve barely explored what today’s models can do via better orchestration (RAG, tools, knowledge graphs, fine‑tunes); others report disappointing real‑world reliability and retention.
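The “lack of calibrated confidence” complaint above has a standard measurement: expected calibration error (ECE), which buckets predictions by stated confidence and compares average confidence to empirical accuracy in each bucket. A minimal sketch, with made-up `(confidence, correct)` pairs purely for illustration:

```python
def ece(preds, n_bins=5):
    """Expected calibration error.

    preds: list of (confidence in [0, 1], correct as bool) pairs.
    Returns the bin-weighted average of |avg confidence - accuracy|.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    total = len(preds)
    err = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        err += (len(b) / total) * abs(avg_conf - accuracy)
    return err

# An overconfident model: ~93% average confidence, 50% accuracy.
overconfident = [(0.95, True), (0.95, False), (0.9, False), (0.92, True)]
print(f"ECE: {ece(overconfident):.3f}")  # -> ECE: 0.430
```

A perfectly calibrated model scores 0; the gap is what blocks the “trust it when it says it’s sure” mode of autonomous use.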

Economics, Hype and Bubble Risk

  • Widespread suspicion that top labs and GPU vendors are over‑promising to justify massive capex; comparisons to dot‑com and crypto bubbles recur.
  • Some expect an “AI winter” or at least a sharp correction if scaling stalls before revenue catches up; others think even sub‑AGI productivity tools justify large businesses.