All AI models might be the same
Architecture vs Data and Model Limits
- Some argue architecture “doesn’t matter” and that convergent behavior mainly reflects shared data; others strongly disagree, likening that claim to saying the choice of algorithm is irrelevant.
- Critics note that current LLMs are largely feed‑forward with discrete training cycles, unlike brains’ continuous feedback and learning, and that our limited understanding of memory and neural dynamics may be hiding better architectures.
- Skeptics of Transformers claim true AGI will likely use very different mechanisms.
Shared Semantic Spaces and “One Model” Hypothesis
- The “Mussolini or Bread” game (narrowing down a hidden concept by repeatedly asking whether it is closer to one reference item than to another) is used to argue for a shared latent semantic space across people and models; many find this compelling but see it as limited to overlapping knowledge and culture (a small embedding sketch follows this list).
- Several commenters point out logical flaws in the game’s reasoning (e.g., many non‑person concepts could still be “closer to Mussolini than bread,” non‑transitive relations).
- Some see the effect as mostly due to similar training corpora and consistent estimators, not deep Platonic structure.
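A minimal sketch of how such a game maps onto embedding geometry, assuming the sentence-transformers package and the all-MiniLM-L6-v2 model (both are illustrative choices, not taken from the discussion): a hidden concept is classified as “closer to Mussolini or to bread” purely by cosine similarity.

```python
# Toy version of the comparison game using a generic text-embedding model.
# Any embedding model could be substituted; the point is that proximity
# questions carry shared information if different spaces are geometrically similar.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def closer_to(concept: str, a: str = "Mussolini", b: str = "bread") -> str:
    """Return whichever anchor the concept is nearer to in embedding space."""
    vecs = model.encode([concept, a, b])
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit vectors -> cosine
    c, va, vb = vecs
    return a if float(c @ va) > float(c @ vb) else b

for concept in ["Napoleon", "baguette", "fascism", "toast"]:
    print(concept, "->", closer_to(concept))
```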
Diffusion Models, Compression, and Memorization
- A highlighted paper shows that optimal diffusion models act like patch mosaics assembled from training data; using this construction directly would produce huge, unwieldy systems (a toy illustration follows this list).
- Others caution against taking the “patch mosaic” metaphor too literally: real models are never trained to that theoretical optimum, are overfit on small benchmarks, and succeed largely because imperfect training enables interpolation, correction, and decomposition tasks.
- Debate continues on whether convergent representations can yield smaller models or whether they inherently push toward larger architectures that approximate the data.
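To make the “patch mosaic” picture concrete, here is a toy sketch (random arrays stand in for real images, and this is not the highlighted paper’s actual construction): a target image is approximated by tiling it with the nearest patches stored from a training set, which also illustrates why the direct approach is unwieldy, since the “model” is essentially the whole training set held in memory.

```python
# Nearest-patch mosaic: approximate a target image using only patches
# copied verbatim from a stored training set.
import numpy as np

rng = np.random.default_rng(0)
PATCH, SIZE = 8, 32
train = rng.random((50, SIZE, SIZE))   # 50 stand-in "training images"
target = rng.random((SIZE, SIZE))      # stand-in image to approximate

# Every aligned patch from the training set; storing all of them is the
# unwieldy part the bullet above refers to.
patches = np.array([
    img[i:i + PATCH, j:j + PATCH]
    for img in train
    for i in range(0, SIZE, PATCH)
    for j in range(0, SIZE, PATCH)
])

mosaic = np.zeros_like(target)
for i in range(0, SIZE, PATCH):
    for j in range(0, SIZE, PATCH):
        block = target[i:i + PATCH, j:j + PATCH]
        dists = ((patches - block) ** 2).sum(axis=(1, 2))
        mosaic[i:i + PATCH, j:j + PATCH] = patches[dists.argmin()]

print("reconstruction MSE:", float(((mosaic - target) ** 2).mean()))
```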
Language, Translation, and Non‑Human Communication
- There’s extensive debate over whether shared embedding spaces could let us translate whale or lion communication without a “Rosetta stone.”
- One side emphasizes shared physical world and experiences (hunger, sun, movement) and the possibility of mapping contexts and abstractions across species.
- The other stresses that meaning is deeply tied to lived, species‑specific experience (Wittgenstein’s lion), that some concepts may be untranslatable or extremely lossy, and that current methods already struggle across human cultures.
- Relatedly, people discuss universal grammar, animal symbolic ability (e.g., apes, dolphins, elephants), and projects like dolphin‑focused LLMs; views range from “humans just have special grammar hardware” to “other species lack only an effective, fitness‑linked naming system.”
LLM Capabilities, Hallucinations, and Domain Use
- In practice, LLMs sometimes fail at simple semantic games (Mussolini/Bread) without heavy prompting.
- A user report on a backup‑software assistant shows plausible but hallucinatory instructions; they conclude domain‑specific LLMs need strong fact‑checking and that good documentation plus search often remain superior.
- Others note that different models often give remarkably similar answers, which they attribute to similar architectures and overlapping corpora.
Intelligence, Learning, and AGI Debates
- Some commenters see LLMs as a brute‑force reverse‑engineering of the human brain, matching its input–output behavior over huge datasets.
- Others insist LLMs “don’t think, learn, or exhibit intelligence,” comparing them to static books, pointing to lack of persistent self‑updating without retraining.
- Opponents counter with analogies to neurons (simple units giving rise to emergent cognition) and argue that training + context + external memory already approximate forms of learning (see the sketch after this list).
- Dynamic tokenization and continual‑learning schemes are discussed as necessary steps toward more “alive” systems.
- There is disagreement over whether LLMs can ever yield AGI, with some viewing Transformers as a dead end and others treating them as early approximations of a single underlying “intelligence.”
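As a rough illustration of the “context plus external memory” point, here is a minimal sketch (the retrieval is naive word overlap and the model call is a stub; none of the names come from the thread): facts observed earlier are stored outside the model and pulled back into the prompt later, which is the sense in which such a system adapts without retraining.

```python
# External memory as a crude form of learning: store facts, retrieve the most
# relevant ones, and prepend them to the prompt of a (stubbed) model call.
from collections import Counter

memory: list[str] = []

def remember(fact: str) -> None:
    memory.append(fact)

def retrieve(query: str, k: int = 2) -> list[str]:
    q = Counter(query.lower().split())
    overlap = lambda m: sum((q & Counter(m.lower().split())).values())
    return sorted(memory, key=overlap, reverse=True)[:k]

def ask_model(question: str) -> str:
    # Stub standing in for a real LLM call; it only shows the assembled prompt.
    context = "\n".join(retrieve(question))
    return f"[context]\n{context}\n[question] {question}"

remember("The nightly backup job starts at 02:00 UTC.")
remember("The retention policy keeps 30 daily snapshots.")
print(ask_model("When does the nightly backup job start?"))
```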
Ethics, Alignment, and Platonic Forms
- Some tie the convergence of concepts (“dog,” “house,” etc.) to Plato’s Forms and speculate about a learnable “Form of the Good” that could aid alignment.
- Others note that moral notions (abortion, animal testing, etc.) are highly contested even within one culture, so a universal “Good” embedding is dubious.
- A few liken this to Jungian archetypes or to deep, overloaded words in natural language.
Implications for Open Source and Future Work
- If all large models converge on essentially the same representation of the world, one strong open‑source model might eventually substitute for proprietary ones.
- Suggested empirical tests include training small models on very different corpora (e.g., disjoint historical traditions) to see whether their embeddings can be mapped into a common space.
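One way such a test could be scored, sketched here with simulated embeddings and an orthogonal Procrustes fit (a methodological assumption, not something specified in the thread): learn a rotation between the two spaces on part of the shared vocabulary, then check how often held-out words land nearest their own counterparts.

```python
# Simulated version of the cross-corpus alignment test. Space B is a rotated,
# noisy copy of space A, so alignment should succeed; with real models trained
# on disjoint corpora, the same score measures how shared the geometry is.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
n_words, dim = 500, 64

space_a = rng.normal(size=(n_words, dim))                # embeddings from corpus A
rotation = np.linalg.qr(rng.normal(size=(dim, dim)))[0]  # hidden ground-truth rotation
space_b = space_a @ rotation + 0.05 * rng.normal(size=(n_words, dim))

# Fit the map on half the shared vocabulary, evaluate on the other half.
train, test = slice(0, 250), slice(250, 500)
R, _ = orthogonal_procrustes(space_a[train], space_b[train])

def unit(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

sims = unit(space_a[test] @ R) @ unit(space_b[test]).T
accuracy = float((sims.argmax(axis=1) == np.arange(sims.shape[0])).mean())
print("held-out matching accuracy:", accuracy)
```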