Large models of what? Mistaking engineering achievements for linguistic agency
Embodiment, “Languaging,” and the Paper’s Core Claim
- Paper argues LLMs lack embodiment, interaction with the real world, and “linguistic agency” (multiple simultaneous goals in communication).
- Some see this as basically correct but obvious and somewhat tautological: it restates what skeptics already believe.
- Others say it’s dated: ignores multimodal models and extensive interactive post‑training, so the video‑game / “brain in a vat” analogy is too rigid.
Training, Feedback, and Post‑Training Methods
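- Discussion of RLHF, RLAIF, and newer methods like DPO. RLHF and DPO rely on human preference data, while RLAIF substitutes AI‑generated preference labels; the methods also differ in how reward is modeled (RLHF trains an explicit reward model, DPO optimizes directly on preference pairs).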
- User–model conversations likely feed future training; this blurs the paper’s framing of LLMs as trained only on static corpora.
- Some note that long chains of interactive correction and retraining don’t yet have a standard name; it’s just “how training works.”
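To make the "differ in how reward is modeled" point concrete, here is a minimal sketch of the per-example DPO loss, which skips the separate reward model and scores a preference pair directly from log-probabilities. The function name, argument names, and the default `beta` are illustrative assumptions, not taken from the thread or from any particular library; in practice the four inputs would be summed token log-probabilities from the policy being trained and a frozen reference model.

```python
import math


def dpo_loss(policy_logp_chosen: float,
             policy_logp_rejected: float,
             ref_logp_chosen: float,
             ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss for one (chosen, rejected) preference pair.

    All names here are hypothetical. Each argument is the log-probability
    of a full response given the prompt, under either the policy or the
    frozen reference model. beta (assumed default 0.1) controls how far
    the policy may drift from the reference.
    """
    # Implicit "rewards": how much more the policy likes each response
    # than the reference model does.
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Policy identical to the reference: zero margin.
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy favors the chosen response more than the reference does.
    print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. RLHF would instead fit a reward model to the same pairs and then optimize the policy against it with a reinforcement learning step.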
Capabilities vs. Limitations
- Several commenters report long, coherent dialogues with recent models, contradicting the paper’s example where the model “loses the thread” quickly.
- LLMs can often perform abstract reasoning on novel, symbolic problems, especially in idealized textbook forms.
- Critics counter that failures at simple arithmetic and brittle reasoning show a lack of underlying concepts; successes are attributed to pattern matching on recurring forms.
- There’s dispute over whether next‑token prediction inherently precludes internal world models; some argue any computable process can be cast as such, others insist current systems are just high‑dimensional curve fits.
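The "any computable process can be cast as next‑token prediction" argument can be illustrated with a toy sketch. The code below is a hypothetical example, not something posted in the thread: it wraps ordinary integer addition behind a function whose only interface is "given the text so far, return the next character." It shows only that the prediction interface does not by itself limit what is computed internally; it says nothing about what trained LLMs actually compute.

```python
def next_token(prefix: str) -> str:
    """Return the next character after an 'a+b=' prompt, or '' when done.

    Hypothetical toy predictor: the interface is next-token prediction,
    but the internals run exact integer addition.
    """
    question, _, partial = prefix.partition("=")
    a, b = question.split("+")
    answer = str(int(a) + int(b))
    # Emit the answer one character at a time, based on what is already written.
    return answer[len(partial)] if len(partial) < len(answer) else ""


def generate(prompt: str) -> str:
    """Autoregressive loop: append predicted tokens until the predictor stops."""
    out = prompt
    while tok := next_token(out):
        out += tok
    return out


if __name__ == "__main__":
    print(generate("12+30="))  # prints 12+30=42
```

The opposing view in the thread is not refuted by this: the question there is whether training on text actually produces such internal structure, not whether the interface permits it.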
Intelligence, AGI, and Definitions
- Repeated theme: we lack precise, agreed definitions of intelligence, consciousness, and AGI, making “LLMs can’t be AGI” or “LLMs think” claims hard to settle.
- One camp: under physicalism, sufficiently advanced behavioral mimicry simply is the thing being mimicked (language, intelligence).
- Other camp: embodiment, stakes, and non‑linguistic experience are essential; text‑only models can at best approximate language and intelligence.
- Some suggest using consensus and obviousness (as with recognizing “flight”) as a pragmatic criterion for intelligence; others point out historical failures to recognize the intelligence of animals or other human groups.
Hype, Value, and Research Trajectory
- Practitioners describe concrete but narrow wins, such as using LLMs for text structuring and data engineering, and contrast these with over‑engineered “agents” built without a clear business need.
- Disagreement over scaling laws: some think more data/parameters will eventually hit hard limits; others expect continued gains with better training and hybrid architectures.
- Overall, thread balances excitement about practical capabilities with skepticism about strong claims of understanding, agency, or inevitable AGI.