Measuring progress toward AGI: A cognitive framework
Reception and perceived purpose of the framework
- Many welcome having any structured benchmark to ground “are we at AGI yet?” debates.
- Others see it as thin content and marketing: a Kaggle-style leaderboard dressed up as a “cognitive framework,” plus a small prize pool.
- Some argue it’s effectively “crowdsourcing the goalposts” so companies can later claim AGI by definition.
Debates on defining and measuring AGI
- The listed cognitive skills (perception, generation, attention, learning, etc.) are seen as reasonable but incomplete or too narrow.
- Alternative taxonomies (working memory, processing speed, fluid/crystallized intelligence, pattern recognition, spatial reasoning) are proposed.
- Critics say many humans wouldn’t pass these metrics, yet are clearly generally intelligent, while current AIs excel at narrow expert tasks.
- Several note that AGI remains undefined; any claim that “LLMs will / won’t scale to AGI” is partly semantic.
What counts as intelligence?
- Ongoing argument over whether intelligence is:
  - Capacity to accomplish tasks vs. capacity to originate and pursue goals.
  - Distinct from knowledge, or inseparable from it.
- Some stress that intelligence exists on a spectrum across multiple dimensions; others object when specific abilities (e.g., vivid mental imagery) become criteria that would exclude many humans.
LLM capabilities vs. limitations
- Enthusiasts highlight dramatic gains: multi-thousand-line code generation, broad competence across domains, passing informal Turing tests, and impressive text synthesis.
- Skeptics emphasize:
  - Lack of reliable unsupervised performance.
  - Weak mid-conversation learning and physical intuition.
  - Limited genuine novelty and invention beyond what appears in their training data.
  - Overhyped claims (imminent replacement of engineers; flawless legal, medical, or financial use; fully autonomous operation).
Social cognition, alignment, and behavior
- Including social cognition as a core benchmark is controversial:
  - Some see it as central for any system interacting with humans.
  - Others note it conflates “navigating society effectively” with prosocial behavior; this could favor manipulative or malevolent agents.
- Several argue benchmarks should include explicit unwanted behaviors (alignment, non-harm) alongside capabilities.
Consciousness and sentience
- Multiple commenters think the missing dimension is consciousness or will: intrinsic goals, continuity of experience, and self-driven motivation.
- Others respond that:
  - Consciousness and will are distinct capacities and should not be conflated.
  - We can never directly verify anyone else’s consciousness, human or machine.
  - Consciousness may be unnecessary, and even undesirable, for AGI used as a tool.
- Views split between materialist “emergent property” accounts and more spiritual or dualist perspectives, with no consensus.
Societal stakes and attitudes
- Some see AGI pursuit as a vanity or profit project; others predict major labor disruption even without true AGI.
- There is tension between excitement over current capabilities and exhaustion with hype, undefined terms, and unresolved safety/ethical issues.