Language models pack billions of concepts into 12k dimensions
Orthogonality, binary vectors, and quasi-orthogonality
- Thread debates what “orthogonal” should mean for binary vectors: strict orthogonality under the real dot product vs “no shared 1-bits” vs arithmetic over GF(2) (XOR).
- Several people note you can’t have more than n mutually orthogonal vectors in n dimensions, but you can have many quasi-orthogonal bitstrings (small overlaps).
- One proposal: use long sparse bit vectors (e.g. 1000 bits with 10 ones per concept) so many concepts can co-exist in a single vector with low overlap, akin to coding theory / spherical codes.
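A minimal numpy sketch of that proposal, using the 1000-bit / 10-ones figures from the comment (the 2,000-concept count and everything else is illustrative), to check that random sparse bitstrings really do stay quasi-orthogonal:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 bits / 10 ones come from the comment; 2000 concepts is an arbitrary choice.
n_bits, n_ones, n_concepts = 1000, 10, 2000

# Each "concept" gets a random 1000-bit code with exactly ten 1-bits.
codes = np.zeros((n_concepts, n_bits))
for i in range(n_concepts):
    codes[i, rng.choice(n_bits, size=n_ones, replace=False)] = 1.0

# Overlap = number of shared 1-bits, the quasi-orthogonality measure in the thread.
overlap = (codes @ codes.T).astype(int)
np.fill_diagonal(overlap, 0)

print("max shared 1-bits between any two concepts:", overlap.max())
print("mean shared 1-bits:", overlap.mean())
# Typically at most a few shared bits out of 10, so thousands of concepts
# coexist with low pairwise interference, in the spirit of a sparse code.
```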
JL lemma, superposition, and sparse autoencoders
- Commenters connect the Johnson–Lindenstrauss (JL) lemma and “near-orthogonality” to the superposition hypothesis and Sparse Autoencoders (SAEs) in mechanistic interpretability.
- SAEs try to recover sparse, nearly-orthogonal “features” from dense activations; this matches the idea of many quasi-orthogonal concepts in a high‑dimensional space.
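To ground the JL connection, here is an illustrative numpy sketch (all sizes are arbitrary, not figures from the article): a random Gaussian projection from 4096 down to 256 dimensions approximately preserves the pairwise distances of a finite point set, which is the sense of “near-orthogonality” the superposition and SAE discussion leans on.

```python
import numpy as np

rng = np.random.default_rng(0)

d_high, d_low, n_points = 4096, 256, 200   # arbitrary illustrative sizes
X = rng.normal(size=(n_points, d_high))

# JL-style random projection: a Gaussian matrix scaled by 1/sqrt(d_low).
P = rng.normal(size=(d_high, d_low)) / np.sqrt(d_low)
Y = X @ P

def pairwise_dists(Z):
    # Squared-norm trick to avoid materializing an (n, n, d) tensor.
    sq = (Z ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.sqrt(np.clip(d2, 0, None))

mask = ~np.eye(n_points, dtype=bool)
ratios = pairwise_dists(Y)[mask] / pairwise_dists(X)[mask]
print(f"distance ratios after projection: min={ratios.min():.3f}, max={ratios.max():.3f}")
# The ratios cluster near 1: a finite point set survives a 16x dimensionality
# reduction with small metric distortion, which is what the JL lemma promises.
```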
Capacity of high-dimensional spaces and “number of concepts”
- Some intuitions are combinatorial (2^k, 3^k, factorial counts), but others push back that this confuses “possible vectors” with meaningful “concepts.”
- One camp thinks 1k–20k dimensions is more than enough for human‑scale knowledge; another says the article overestimates capacity because what matters is preserving relative distances and rankings, not just almost-orthogonality.
- A separate critique calls the “10^200 concepts in 12k dimensions” claim absurd in information-theoretic terms, arguing it conflates geometry with Shannon capacity.
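For context, the kind of counting that produces numbers like 10^200 is the standard near-orthogonality bound. The figures below are a back-of-envelope illustration with an arbitrarily chosen tolerance and constant, not the article's exact calculation.

```latex
% Random unit vectors in R^d concentrate near orthogonality:
\[
  \Pr\bigl[\,|\langle u, v\rangle| \ge \epsilon\,\bigr] \;\le\; 2\,e^{-d\epsilon^{2}/2},
\]
% so a union bound over pairs yields exponentially many quasi-orthogonal directions:
\[
  N(d,\epsilon) \;\gtrsim\; e^{d\epsilon^{2}/4},
  \qquad
  N(12288,\,0.4) \;\gtrsim\; e^{0.04 \times 12288} = e^{491.5} \;\approx\; 10^{213}.
\]
% The critique in the thread: this counts geometric packings of nearly
% orthogonal directions, not the Shannon capacity of what a model can
% reliably write into and read back out of its activations.
```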
Topological vs metric preservation and folding
- A long subthread distinguishes JL’s guarantees for finite point sets from embedding the entire underlying manifold (Takens/Whitney/Sauer–Yorke).
- Argument: with a fixed embedding dimension k, refining resolution inevitably causes “folding”: distant regions of the true manifold get mapped close together, which could explain some LLM pathologies.
- Others ask for concrete empirical examples and suggest folding may be a theoretical concern rather than a dominant practical issue.
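A toy numpy sketch of the geometric intuition only, not one of the empirical LLM examples the thread asks for: linearly projecting a spiral from the plane down to a single coordinate makes points that are far apart along the curve land almost on top of each other, which is the “folding” being described.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 1-D manifold (a spiral) embedded in R^2.
t = np.linspace(0, 4 * np.pi, 2000)
spiral = np.stack([t * np.cos(t), t * np.sin(t)], axis=1)

# Fixed target dimension k=1: a random linear projection to one coordinate.
proj = rng.normal(size=2)
z = spiral @ proj

# Count pairs that are far apart along the manifold (in t) yet nearly
# collide after projection.
i, j = np.triu_indices(len(t), k=1)
far_on_manifold = np.abs(t[i] - t[j]) > np.pi          # intrinsically distant
close_after_projection = np.abs(z[i] - z[j]) < 1e-2    # nearly identical image
print("folded pairs:", int((far_on_manifold & close_after_projection).sum()))
# The count is large: with the target dimension fixed too low for the manifold,
# such collisions are unavoidable no matter how finely the curve is sampled.
```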
How LLMs actually store concepts
- Multiple comments stress that models don’t assign one dimension per concept or enforce orthogonality; “understanding” emerges from the whole network, non-linearities, and attention, not just raw embedding geometry.
- KV cache and many layers massively expand effective representational space beyond a single 12k‑dim vector.
- Some note that non-linearities (e.g. softmax, GELU) and normalization mean that vectors need not be orthogonal; many more items than dimensions can be disambiguated even in low-dimensional spaces.
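A small numpy sketch of that last point (all sizes are arbitrary): with an argmax-over-dot-products readout, far more items than dimensions can be reliably told apart even though their vectors are only quasi-orthogonal, not orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_items = 64, 10_000          # 10k items in only 64 dimensions (arbitrary sizes)
codebook = rng.normal(size=(n_items, d))
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)

# Simulate noisy retrieval: perturb an item's vector, then decode it by the
# highest dot product against the whole codebook (an argmax/softmax readout).
n_trials = 1000
idx = rng.integers(n_items, size=n_trials)
noisy = codebook[idx] + 0.05 * rng.normal(size=(n_trials, d))
decoded = np.argmax(noisy @ codebook.T, axis=1)

print("decode accuracy:", (decoded == idx).mean())
# Accuracy is essentially 1.0: merely quasi-orthogonal vectors plus a
# non-linear readout suffice to disambiguate far more items than dimensions.
```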
Peer review, blog papers, and AI-written style
- Long debate on blog-style mechanistic interpretability work: high impact and widely cited vs “sloppy,” analogy-heavy, and lacking formal peer review.
- Several argue ML conference peer review is currently dysfunctional; others say formal review would still force clearer definitions and less hand‑wavy claims.
- Distinct subthread complains the article’s tone feels like LLM-generated “AI slop”: overuse of superlatives, formulaic structure, and internal inconsistencies (e.g., misinterpreted constants, spherical-code-like arguments).
- Counterpoint: using an LLM for wording doesn’t invalidate the underlying math or experiments, though it can mask errors and erode trust.
Semantics vs syntax in LLMs
- One view: LLMs don’t contain “real-world concepts,” only syntactic token relationships; any semantics live in human interpretation.
- Others counter that models handle homonyms and category judgments in ways that align with semantic distinctions, and that syntax-only pattern matching is too weak an explanation.
- No consensus: some insist “reasoning” talk is overclaim; others see emergent semantic structure in embeddings and behavior.
Miscellaneous points and open questions
- Questions about what actually enforces (near-)orthogonality during training go unanswered; the implication is that it emerges from the loss, the architecture, and normalization (a toy sketch follows this list).
- Some argue there aren’t “billions of human concepts” in the strict philosophical sense, so capacity claims may be solving the wrong problem.
- A late comment notes tension between this theory-heavy “huge capacity” narrative and empirical work finding limited semantic capacity for some embedding uses; the reconciliation is left unclear.
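On the open question of what enforces near-orthogonality, a toy PyTorch sketch in the spirit of Anthropic's “Toy Models of Superposition” (all sizes and hyperparameters are arbitrary; this is an illustration, not the thread's or the article's experiment): nothing in the objective asks for orthogonal features, yet the learned feature directions tend to end up with small pairwise cosines.

```python
import torch

torch.manual_seed(0)

n_features, d_model, sparsity = 256, 32, 0.02   # arbitrary toy sizes
W = torch.nn.Parameter(0.1 * torch.randn(d_model, n_features))
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-3)

for step in range(10_000):
    # Sparse synthetic "features": each one active with low probability.
    x = (torch.rand(512, n_features) < sparsity).float() * torch.rand(512, n_features)
    h = x @ W.T                       # compress 256 features into 32 dimensions
    x_hat = torch.relu(h @ W + b)     # reconstruct via a tied, non-linear readout
    loss = ((x - x_hat) ** 2).mean()  # nothing here demands orthogonality
    opt.zero_grad()
    loss.backward()
    opt.step()

# Measure pairwise cosine similarity between learned feature directions.
dirs = torch.nn.functional.normalize(W.detach(), dim=0)   # one direction per feature
cos = dirs.T @ dirs
off_diag = cos[~torch.eye(n_features, dtype=torch.bool)]
print(f"mean |cos| between features: {off_diag.abs().mean().item():.3f}")
# Interference stays small for most feature pairs: quasi-orthogonality emerges
# from the reconstruction loss plus sparsity, not from an explicit constraint.
```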