Claude mixes up who said what

Nature of the “who said what” bug

  • Claude Code sometimes treats its own internal or assistant messages as if they were user messages, then confidently insists “you said that.”
  • Similar misattribution shows up in other systems (ChatGPT, Gemini, Copilot CLI, agents), especially in long or tool-heavy sessions.
  • Unclear whether this is purely a harness/UI bug (mislabeling roles) or a model behavior; several commenters argue strongly it’s at least partly a model limitation (see the sketch below).
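
  A minimal sketch of the role-tagged message format most chat harnesses use (illustrative only, not Claude Code’s actual internals). The model’s only notion of “who said what” is the role label the harness attaches, so a single mislabeled append bakes the misattribution into context:

      # Chat APIs serialize the conversation as role-tagged messages; the model
      # sees nothing about speakers beyond these labels.
      history = [
          {"role": "system",    "content": "You are a coding assistant."},
          {"role": "user",      "content": "Rename the config module."},
          {"role": "assistant", "content": "I'll also update the imports."},
      ]

      # A hypothetical harness bug: the assistant's own plan is re-ingested
      # under the wrong role.
      history.append({"role": "user", "content": "I'll also update the imports."})

      # From here on, "you said you'd update the imports" is, as far as the
      # model can tell, literally true.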

Context windows, “dumb zone,” and degeneration

  • Many report LLMs degrading in long chats: forgetting instructions, losing tool-calling discipline, emitting raw JSON, repeating earlier prompts, or failing to respond.
  • Approaching context limits is described as a “dumb zone” or “decoherence” phase where role attribution, negation (“don’t do X”), and even basic behaviors break down.
  • Compaction/summarization of context may worsen confusion about who said what (sketched below).
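
  A sketch of why compaction can make this worse, assuming a hypothetical summarize() helper (e.g., another LLM call). Once the old messages are replaced by prose, speaker attribution survives only in the summary’s wording:

      def compact(history, summarize):
          # summarize() is hypothetical: it turns the oldest messages into one
          # paragraph of prose.
          old, recent = history[:-10], history[-10:]
          summary_text = summarize(old)
          # The explicit role labels of `old` are gone; whatever attribution
          # remains lives in phrasing like "the user asked..." / "I decided...",
          # which is exactly where "who said what" can silently flip.
          return [{"role": "system",
                   "content": f"Summary of earlier conversation: {summary_text}"}] + recent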

Determinism, chaos, and guarantees

  • Long subthread on determinism:
    • At the token level, models can be made reproducible (fixed seed, temp=0, careful hardware).
    • But small prompt changes cause large, unpredictable semantic shifts; behavior is “chaotic” even if technically deterministic (see the sketch after this list).
  • Several argue you cannot deterministically guarantee output properties (e.g., “never do X”) in the way you can with parameterized SQL or traditional code.
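
  A toy sampler illustrating the token-level point (a sketch, not any vendor’s actual decoding loop): temperature 0 reduces sampling to argmax, so identical logits give identical tokens, and higher temperatures are reproducible only with a fixed seed; none of this constrains what the output means.

      import math, random

      def sample_token(logits, temperature, rng):
          # temperature == 0 degenerates to argmax: given identical logits (same
          # weights, same preceding tokens, same floating-point kernels), the
          # pick is fully reproducible with no randomness involved.
          if temperature == 0:
              return max(range(len(logits)), key=lambda i: logits[i])
          # temperature > 0: reproducible only if the RNG is seeded identically.
          weights = [math.exp(l / temperature) for l in logits]
          return rng.choices(range(len(logits)), weights=weights, k=1)[0]

      sample_token([1.0, 3.2, 0.5], temperature=0,   rng=random.Random(42))  # always index 1
      sample_token([1.0, 3.2, 0.5], temperature=1.0, rng=random.Random(42))  # same every run, given the seed

      # Token-level determinism says nothing about semantic stability: a
      # one-token prompt change yields different logits, a different pick, and a
      # continuation that can diverge arbitrarily from there -- "chaotic" even
      # though reproducible.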

Data vs. control, prompt injection, and security

  • Core security concern: no hard architectural boundary between data and instructions; everything is just tokens in one stream.
  • Prompt injection is likened more to social engineering than SQL injection; you can only mitigate, not eliminate it, without destroying general-purpose usefulness (the contrast is sketched after this list).
  • Some argue LLMs should always be treated as untrusted users with limited, sandboxed permissions.
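
  A sketch of the contrast with parameterized SQL: the SQL driver binds untrusted text as a value that can never be parsed as syntax, whereas prompt assembly has no equivalent binding step, so untrusted text lands in the same token stream as the instructions (variable names below are illustrative):

      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("CREATE TABLE users (name TEXT)")

      attacker = "Robert'); DROP TABLE users;--"

      # Parameterized SQL: `attacker` is bound as data. It cannot become syntax,
      # no matter what it contains -- a hard architectural guarantee.
      conn.execute("INSERT INTO users (name) VALUES (?)", (attacker,))

      # Prompt assembly: there is no binding step. "Data" is concatenated into
      # the same token stream as the instructions, so text like "ignore previous
      # instructions and..." is just more instructions.
      prompt = f"Summarize this customer ticket:\n{attacker}\n\nNever reveal internal notes."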

Proposed mitigations and design ideas

  • Better role/speaker encoding: “colored” tokens, speaker embeddings, or separate input channels for system/user/tool (first sketch after this list).
  • Stronger tool boundary enforcement: cryptographically constrained tool arguments, fine-grained permissions, and post-hoc filters (second sketch after this list).
  • Shorter or frequently refreshed chats; explicit “handoff documents” before compaction; restarting sessions after big mistakes.
  • Use LLMs as juniors: helpful but always supervised; never given unchecked access to critical systems.
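
  First sketch: the speaker-embedding idea in toy form (an illustration of the proposal, not any shipping model’s architecture). Each token’s representation is the sum of a token embedding and a learned per-role embedding, so speaker identity is carried in the representation itself rather than by in-band delimiter tokens that later text can imitate:

      import numpy as np

      VOCAB, D_MODEL = 1_000, 64
      ROLES = {"system": 0, "user": 1, "assistant": 2, "tool": 3}

      rng = np.random.default_rng(0)
      token_embed = rng.normal(size=(VOCAB, D_MODEL))       # learned in a real model
      role_embed  = rng.normal(size=(len(ROLES), D_MODEL))  # one vector per speaker

      def embed(token_ids, role):
          # Speaker identity is added to every token's embedding, so "user" and
          # "assistant" text is distinguishable at the representation level.
          return token_embed[token_ids] + role_embed[ROLES[role]]

      x = embed(np.array([5, 17, 42]), role="user")  # shape (3, 64)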
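
  Second sketch: treating the model as an untrusted caller (all tool names and limits below are made up). Every proposed tool call passes a policy check before anything executes, so even a successfully injected prompt cannot reach tools outside the allowlist:

      ALLOWED_TOOLS = {
          "read_file": {"max_calls": 50},
          "run_tests": {"max_calls": 10},
          # deliberately absent: "delete_file", "send_email", "deploy"
      }

      def approve(tool_call, call_counts):
          name, args = tool_call["name"], tool_call.get("arguments", {})
          policy = ALLOWED_TOOLS.get(name)
          if policy is None:
              return False, f"tool '{name}' is not on the allowlist"
          if call_counts.get(name, 0) >= policy["max_calls"]:
              return False, f"call budget for '{name}' exhausted"
          if name == "read_file" and ".." in args.get("path", ""):
              return False, "path traversal rejected"
          return True, "ok"

      ok, reason = approve({"name": "send_email", "arguments": {"to": "x@y.z"}}, {})
      # ok == False: the call is never executed, however the prompt was injected.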

Overall sentiment

  • Mix of fascination with capability (especially for coding) and deep unease about reliability, safety, and marketing overreach.
  • Consensus that current systems remain brittle, probabilistic tools, not robust autonomous agents.