Measuring progress toward AGI: A cognitive framework

Scope and purpose of the framework

  • Many welcome having any structured benchmark to ground “are we at AGI yet?” debates.
  • Others see it as thin content and marketing: a Kaggle-style leaderboard dressed up as a “cognitive framework,” plus a small prize pool.
  • Some argue it’s effectively “crowdsourcing the goalposts” so companies can later claim AGI by definition.

Debates on defining and measuring AGI

  • The listed cognitive skills (perception, generation, attention, learning, etc.) are seen as reasonable but incomplete or too narrow.
  • Alternative taxonomies (working memory, processing speed, fluid/crystallized intelligence, pattern recognition, spatial reasoning) are proposed.
  • Critics note that many humans would fail these metrics yet are clearly generally intelligent, while current AIs excel at narrow expert tasks.
  • Several note that AGI remains undefined; any claim that “LLMs will / won’t scale to AGI” is partly semantic.

What counts as intelligence?

  • Ongoing argument over whether intelligence is:
    • Capacity to accomplish tasks vs. capacity to originate and pursue goals.
    • Distinct from knowledge, or inseparable from it.
  • Some stress that intelligence exists on spectra and along multiple dimensions; others object when specific abilities (e.g., vivid imagery) become criteria that would exclude many humans.

LLM capabilities vs. limitations

  • Enthusiasts highlight dramatic gains: multi-thousand-line code generation, broad competence across domains, passing informal Turing tests, and impressive text synthesis.
  • Skeptics emphasize:
    • Lack of reliable unsupervised performance.
    • Weak mid-conversation learning and physical intuition.
    • Limited genuine novelty and invention on problems absent from training data.
    • Overhyped claims (replacing engineers soon, flawless legal/medical/financial use, autonomous operations).

Social cognition, alignment, and behavior

  • Including social cognition as a core benchmark is controversial:
    • Some see it as central for any system interacting with humans.
    • Others note it conflates “navigating society effectively” with prosocial behavior; this could favor manipulative or malevolent agents.
  • Several argue benchmarks should include explicit unwanted behaviors (alignment, non-harm) alongside capabilities.

Consciousness and sentience

  • Multiple commenters think the missing dimension is consciousness or will: intrinsic goals, continuity of experience, and self-driven motivation.
  • Others respond that:
    • Consciousness and will are distinct concepts.
    • We can never directly verify anyone else’s consciousness, human or machine.
    • Consciousness may be unnecessary—and even undesirable—for AGI used as a tool.
  • Views split between materialist “emergent property” accounts and more spiritual or dualist perspectives, with no consensus.

Societal stakes and attitudes

  • Some see AGI pursuit as a vanity or profit project; others predict major labor disruption even without true AGI.
  • There is tension between excitement over current capabilities and exhaustion with hype, undefined terms, and unresolved safety/ethical issues.