Bag of words, have mercy on us
Metaphors and Mental Models
- Many object to “bag of words” as a metaphor: it’s already a specific NLP term (illustrated in the sketch after this list), sounds trivial, and doesn’t match how people actually use LLMs.
- Alternatives proposed: “superpowered autocomplete,” “glorified/luxury autocomplete,” “search engine that can remix results,” “spoken query language,” or “Library of Babel with compression and artifacts.”
- Some defend “bag of words” (or “word-hoard”) as deliberately anti-personal: a corrective to “silicon homunculus” metaphors, not a technical description.
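For readers who haven’t met the NLP term, here is a minimal Python sketch of what “bag of words” classically denotes: count the tokens and discard all ordering. The whitespace tokenizer and toy sentences are illustrative simplifications, not anything from the thread.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Classic bag-of-words representation: token counts, word order discarded."""
    return Counter(text.lower().split())

# Order-blindness is the whole point of the term: these sentences mean
# different things but collapse to the identical bag.
assert bag_of_words("the dog bit the man") == bag_of_words("the man bit the dog")
```

That order-blindness is why some see the term as a poor description of models conditioned on full ordered context, and why others value it as a deliberately deflating metaphor.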
Anthropomorphism and Interfaces
- Commenters report seeing people treat LLMs as thinking, feeling agents, despite repeated explanations that they are predictors.
- Chat-style UIs, system prompts, memory, tool use, and human-like tone are seen as major anthropomorphizing scaffolding that hides the underlying mechanics.
- Some argue a less chatty, more “complete this text / call this tool” interface would reduce misplaced trust and quasi-religious attitudes (a toy contrast follows this list).
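A hypothetical sketch of that interface distinction: `complete` below is an invented stand-in for a raw next-token completion endpoint (no real API is being described). The point is that the chat experience is a text template wrapped around the same predictor.

```python
def complete(prompt: str) -> str:
    """Invented stand-in for a raw completion endpoint; a real one would return
    the model's continuation of whatever text it is handed."""
    return " [model continuation]"

def chat_turn(user_message: str) -> str:
    """Chat framing: wrap the text in a persona template before completing it."""
    prompt = (
        "System: You are a helpful assistant.\n"
        f"User: {user_message}\n"
        "Assistant:"
    )
    return complete(prompt)

def bare_completion(text: str) -> str:
    """'Complete this text' framing: hand the raw text straight to the predictor."""
    return complete(text)

# Both paths reach the same predictor; only the chat template implies a speaking agent.
print(chat_turn("Why is the sky blue?"))
print(bare_completion("The sky appears blue because"))
```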
Capabilities vs. “Just Autocomplete”
- Disagreement over whether “just prediction” is dismissive (a toy sketch of the mechanism follows this list):
  - Critics: next-token prediction on text ≠ modeling the physical world or doing reliable reasoning; models lack stable world models, meta-knowledge, and consistent self-critique.
  - Defenders: prediction is central to human cognition too; given scale, tool use, feedback loops and agents, prediction-plus-scaffolding may cross into genuine problem solving.
- Examples are cited on both sides: impressive performance on math and competition problems and code generation for novel ISAs, versus brittle reasoning, hallucinations, and inconsistency under minor prompt changes.
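To make “just prediction” concrete, here is a toy sketch of the mechanism both camps are arguing about: generation is a loop that repeatedly picks a next token conditioned on what came before. The hard-coded bigram table is a deliberately crude stand-in for a trained model; a real LLM conditions a distribution over its whole vocabulary on the entire preceding context.

```python
import random

# Crude stand-in for a trained model: each word maps to a few possible next words.
BIGRAMS = {
    "the": ["sky", "model"],
    "sky": ["is"],
    "model": ["predicts"],
    "predicts": ["the"],
    "is": ["blue", "predicted"],
    "blue": ["."],
    "predicted": ["."],
}

def generate(prompt: str, max_tokens: int = 12) -> str:
    """Autoregressive generation: append one predicted token at a time."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        next_token = random.choice(BIGRAMS.get(tokens[-1], ["."]))
        tokens.append(next_token)
        if next_token == ".":
            break
    return " ".join(tokens)

print(generate("the"))  # e.g. "the sky is blue ."
```

The disagreement above is over whether scale and scaffolding turn this loop into genuine problem solving, not over the loop itself.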
Human Cognition Comparisons
- Long subthread on whether all thinking is prediction: references to predictive processing / free-energy ideas (the usual formalization is sketched after this list) vs. objections that this redefines “thinking” so broadly that it loses usefulness.
- Some argue we don’t understand human thought or consciousness well enough to assert LLMs categorically “don’t think”; others say lack of learning at inference time, motivation, and embodiment are decisive differences.
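For reference, the “free-energy ideas” invoked in that subthread are usually stated as a bound on surprise; here o is an observation, s the hidden causes, q(s) the agent’s approximate posterior, and p its generative model:

```latex
F = \mathbb{E}_{q(s)}\bigl[\ln q(s) - \ln p(o, s)\bigr]
  = \underbrace{D_{\mathrm{KL}}\bigl(q(s)\,\Vert\,p(s \mid o)\bigr)}_{\ge 0} - \ln p(o)
  \;\ge\; -\ln p(o).
```

Minimizing F both fits q to the true posterior and drives down the surprise -ln p(o), which is what licenses the “all cognition is prediction-error minimization” reading, and what critics call an over-broad redefinition of thinking.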
Ethics, Risk, and Social Roles
- Underestimating LLMs risks missed opportunities; overestimating them risks delusion, over-delegation in high-stakes domains, and possible moral misclassification (either of humans or models).
- Economic concern: many “word-only” roles may be replaceable if a “magic bag of words” is good enough for employers.
- Creative concern: several insist they value works because humans made them, echoing the “forklift at the gym” analogy; others see AI as acceptable when the goal is the output, not personal growth.
Interpretability and Inner Structure
- Interpretability work (e.g., concept neurons, cross-lingual features, confidence/introspection signals) is cited as evidence of internal structure beyond naive bag-of-words.
- Skeptics counter that much of this research is unreviewed, commercially motivated, and doesn’t yet demonstrate human-like understanding or robust world models.