Training is not the same as chatting: LLMs don’t remember everything you say
Misconceptions: Training vs. Chatting
- Many users wrongly assume the model “learns from” each chat in real time and will do better next time because of their input.
- Several commenters stress the distinction between a fixed, pre-trained model and future training runs that may use aggregated logs.
- Some criticize the article’s framing as semantic or misleading: “doesn’t remember” vs. “is stored and may later influence future models.”
- Others defend the clarification as crucial, because users waste time thinking they’re “training” the model via usage.
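The distinction drawn in these comments can be illustrated with a toy sketch. It is not how any vendor's system works: `ModelSnapshot`, `ChatService`, and `train_next_version` are hypothetical names, and the "training" step is a placeholder that only shows weights changing in a separate, later run rather than during a chat.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelSnapshot:
    """A deployed model: its weights are fixed once training has finished."""
    version: str
    weights: tuple


@dataclass
class ChatService:
    """Hypothetical serving layer: answers with a fixed snapshot and logs messages."""
    model: ModelSnapshot
    logs: list = field(default_factory=list)

    def chat(self, message: str) -> str:
        # The message is stored, but the snapshot's weights are never touched here.
        self.logs.append(message)
        return f"[{self.model.version}] reply to: {message}"


def train_next_version(old: ModelSnapshot, logs: list) -> ModelSnapshot:
    """Placeholder for a separate, later training run over aggregated logs."""
    delta = 0.01 * len(logs)  # toy update, assumed purely for illustration
    return ModelSnapshot(version="v2", weights=tuple(w + delta for w in old.weights))


service = ChatService(ModelSnapshot(version="v1", weights=(0.1, 0.2)))
service.chat("Please remember that I prefer short answers.")
service.chat("Did you learn that?")
print(service.model.weights)  # unchanged by chatting
print(train_next_version(service.model, service.logs).version)
```

In this sketch, chatting only grows `logs`. Any effect on weights happens in a new snapshot produced by a later run, which matches both sides of the thread: the deployed model does not learn live, yet stored conversations may still matter later.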
Data Retention, Privacy, and Trust
- Commenters highlight an “AI trust crisis”: vendors claim not to train on user data, but many people don’t believe them.
- Economic incentives (e.g., paid data deals with platforms) drive suspicion that free user chats will also be exploited.
- Opt-out mechanisms exist but cannot be independently audited, so people assume the worst case.
- Even if current models don’t live-train, logs can be leaked, misused, or later repurposed, so sensitive data remains risky.
Quality and Usefulness of Chat Logs as Training Data
- Some argue chat logs are mostly low-quality: confused questions, mistakes, and rants, making them poor pretraining material.
- Others note they can still be valuable for feedback/RLHF, especially where users correct bad outputs or rate answers.
- A related concern: if proprietary or personal information ends up in training data, future responses could leak it in damaging ways.
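As a rough illustration of how ratings in chat logs could become feedback data, the sketch below turns rated responses to the same prompt into (chosen, rejected) preference pairs, the kind of input preference-based fine-tuning typically consumes. The `LoggedResponse` record and its fields are assumptions made for this example, not a real logging schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass
class LoggedResponse:
    """Hypothetical log record: one model response and an optional user rating."""
    prompt: str
    response: str
    rating: Optional[int]  # e.g. 1 to 5, or None if the user never rated it


def preference_pairs(logs):
    """Return (prompt, chosen, rejected) tuples from differently rated responses."""
    by_prompt = defaultdict(list)
    for entry in logs:
        if entry.rating is not None:  # unrated turns carry no preference signal
            by_prompt[entry.prompt].append(entry)

    pairs = []
    for prompt, entries in by_prompt.items():
        for a, b in combinations(entries, 2):
            if a.rating == b.rating:
                continue  # ties say nothing about which response is better
            chosen, rejected = (a, b) if a.rating > b.rating else (b, a)
            pairs.append((prompt, chosen.response, rejected.response))
    return pairs


logs = [
    LoggedResponse("How do I reverse a list?", "Use reversed(xs).", 5),
    LoggedResponse("How do I reverse a list?", "Lists cannot be reversed.", 1),
    LoggedResponse("Tell me a joke.", "Why did the chicken...", None),
]
print(preference_pairs(logs))
```

Most log entries yield nothing here (unrated, or only one response per prompt), which is consistent with the view that raw logs are noisy and that only the small rated or corrected fraction is useful as feedback.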
Memory, Personalization, and RAG
- The new “memory” feature is discussed as a shallow system-prompt injection of short facts, not true weight changes.
- Several find it annoying or poorly filtered; it often stores trivial or context-specific details.
- Commenters describe more advanced patterns: RAG over conversation history, summarization, “cognitive compression,” and vector stores to simulate long-term memory.
- Commenters emphasize the distinction between model-level memory (changes to weights) and service-layer memory (tooling built around the model).
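The service-layer patterns described above can be sketched in a few lines. This is a minimal illustration, not any product's implementation: `MemoryStore` and its methods are hypothetical, and bag-of-words cosine similarity stands in for the embedding model and vector store a real system would likely use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical service-layer memory; the model's weights are never involved."""

    def __init__(self):
        self.facts = []    # short facts injected into every prompt
        self.history = []  # past turns, retrieved only when relevant

    def remember_fact(self, fact):
        self.facts.append(fact)

    def add_turn(self, text):
        self.history.append(text)

    def retrieve(self, query, k=2):
        """Return up to k past turns that share vocabulary with the query."""
        q = Counter(tokenize(query))
        scored = [(cosine(q, Counter(tokenize(t))), t) for t in self.history]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    def build_prompt(self, user_message):
        """Assemble the text actually sent to the fixed model."""
        lines = ["System: Known facts about the user:"]
        lines += [f"- {fact}" for fact in self.facts]
        lines.append("Relevant earlier conversation:")
        lines += [f"- {turn}" for turn in self.retrieve(user_message)]
        lines.append(f"User: {user_message}")
        return "\n".join(lines)


store = MemoryStore()
store.remember_fact("User prefers metric units.")
store.add_turn("We discussed migrating the database to Postgres.")
store.add_turn("My cat is named Miso.")
print(store.build_prompt("Which database did we pick?"))
```

Everything the model appears to remember in this sketch arrives as ordinary prompt text assembled by the service. Deleting an entry from `facts` or `history` removes the memory, with no retraining involved.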
Continuous Learning and Dynamic Evaluation
- Some see the lack of continual learning as the most disappointing limitation; others point out training is expensive, slow, and risky to update frequently.
- Techniques like dynamic evaluation, test-time adaptation, LoRAs, and prompt/soft-prompt tuning are mentioned as ways to update behavior on the fly, but they’re hard to deploy at scale.
- There’s interest in future “live-trained” or highly personalized models, especially on local devices.
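To make the idea of updating behavior on the fly concrete, here is a toy sketch of adapter-style test-time adaptation: shared base weights stay fixed while a small per-user delta is updated by gradient steps on new examples. It is a deliberately simplified stand-in for the techniques mentioned above (a real LoRA uses low-rank matrices inside a neural network), and every name and number in it is assumed for illustration.

```python
from dataclasses import dataclass, field

BASE_WEIGHTS = (1.0, -0.5)  # shared and fixed; never modified at serving time


@dataclass
class UserAdapter:
    """Small per-user parameters added on top of the fixed base weights."""
    delta: list = field(default_factory=lambda: [0.0, 0.0])


def predict(x, adapter):
    """Linear prediction using base weights plus the user's delta."""
    return sum((w + d) * xi for w, d, xi in zip(BASE_WEIGHTS, adapter.delta, x))


def adapt(adapter, x, target, lr=0.1):
    """One gradient step on squared error, updating only the adapter."""
    error = predict(x, adapter) - target
    # Gradient of 0.5 * error**2 with respect to delta[i] is error * x[i].
    for i, xi in enumerate(x):
        adapter.delta[i] -= lr * error * xi
    return error


alice = UserAdapter()
example, wanted = (1.0, 2.0), 3.0
for _ in range(20):
    adapt(alice, example, wanted)

print(round(predict(example, alice), 3))  # close to the target after adaptation
print(predict(example, UserAdapter()))    # a fresh user still gets base behavior
```

Even in this toy form, the deployment difficulty raised in the thread is visible: each user needs their own stored parameters and their own update steps, which multiplies storage and compute across a large user base.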
User Understanding, UX, and Regulation
- Commenters report a large gap between expert mental models and everyday users’ expectations, reinforced by chat-style interfaces and anthropomorphic phrasing (“I’ll remember that”).
- Some propose education or even “AI licenses” for professional use; others resist adding barriers, comparing AI risks to existing internet and social-media harms.
- Overall, many see clearer communication about what is and isn’t remembered as essential product UX, not just a technical detail.