Measuring progress toward AGI: A cognitive framework

Scope and purpose of the framework

  • Many welcome having any structured benchmark to ground “are we at AGI yet?” debates.
  • Others see it as thin content and marketing: a Kaggle-style leaderboard dressed up as a “cognitive framework,” plus a small prize pool.
  • Some argue it’s effectively “crowdsourcing the goalposts” so companies can later claim AGI by definition.

Debates on defining and measuring AGI

  • The listed cognitive skills (perception, generation, attention, learning, etc.) are seen as reasonable but incomplete or too narrow.
  • Alternative taxonomies (working memory, processing speed, fluid/crystallized intelligence, pattern recognition, spatial reasoning) are proposed.
  • Critics note that many humans would fail these metrics yet are clearly generally intelligent, while current AIs excel at narrow expert tasks.
  • Several note that AGI remains undefined; any claim that “LLMs will / won’t scale to AGI” is partly semantic.

What counts as intelligence?

  • Ongoing argument over whether intelligence is:
    • Capacity to accomplish tasks vs. capacity to originate and pursue goals.
    • Distinct from knowledge, or inseparable from it.
  • Some stress that intelligence exists on spectra and along multiple dimensions; others object when specific abilities (e.g., vivid imagery) become criteria that would exclude many humans.

LLM capabilities vs. limitations

  • Enthusiasts highlight dramatic gains: multi-thousand-line code generation, broad competence across domains, passing informal Turing tests, and impressive text synthesis.
  • Skeptics emphasize:
    • Lack of reliable unsupervised performance.
    • Weak mid-conversation learning and physical intuition.
    • Limited genuine novelty and invention on problems absent from training data.
    • Overhyped claims (replacing engineers soon, flawless legal/medical/financial use, autonomous operations).

Social cognition, alignment, and behavior

  • Including social cognition as a core benchmark is controversial:
    • Some see it as central for any system interacting with humans.
    • Others note it conflates “navigating society effectively” with prosocial behavior; this could favor manipulative or malevolent agents.
  • Several argue benchmarks should include explicit unwanted behaviors (alignment, non-harm) alongside capabilities.

Consciousness and sentience

  • Multiple commenters think the missing dimension is consciousness or will: intrinsic goals, continuity of experience, and self-driven motivation.
  • Others respond that:
    • Consciousness and will are distinct concepts.
    • We can never directly verify anyone else’s consciousness, human or machine.
    • Consciousness may be unnecessary—and even undesirable—for AGI used as a tool.
  • Views split between materialist “emergent property” accounts and more spiritual or dualist perspectives, with no consensus.

Societal stakes and attitudes

  • Some see AGI pursuit as a vanity or profit project; others predict major labor disruption even without true AGI.
  • There is tension between excitement over current capabilities and exhaustion with hype, undefined terms, and unresolved safety/ethical issues.