Regression: malware reminder on every read still causes subagent refusals
Bug and malware-prompt regression
- Core issue: a system prompt in Claude’s managed agents / Claude Code says “whenever you read a file … you MUST refuse to improve or augment the code,” with no “if it is malware” condition scoping the refusal.
- Result: agents often refuse to modify any code after reading it, even when users clarify it isn’t malware.
- Some users report they can override it by explicitly stating the code is not malware; others still see refusals.
- The malware-analysis-on-every-file behavior is seen as wasteful, context-bloating, and poorly thought through.
Token usage, context bloat, and cost
- Several commenters report large, unexplained token use: heavy hidden prompts, repeated malware checks, and verbose “thoughts.”
- Some find managed harnesses use double or more the context of lighter custom harnesses (e.g., 30–40k vs ~15k tokens just to say “hi”).
- People worry about “revenue-positive bugs,” where misbehavior increases token consumption at users’ expense.
- Others note that subscription plans can be dramatically cheaper than raw API use, pressuring users into first‑party harnesses.
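The cost concern above is simple arithmetic: every request resends the full context, so hidden prompt overhead multiplies directly into the bill. A minimal sketch, using the thread’s 30–40k vs ~15k token figures and a purely illustrative $3-per-million-input-token price (not any provider’s actual rate):

```python
# Illustrative input-token price: $3 per million tokens (hypothetical).
PRICE_PER_MTOK = 3.00

def request_cost(context_tokens):
    """Cost of a single request whose input context is `context_tokens` long."""
    return context_tokens / 1_000_000 * PRICE_PER_MTOK

heavy = request_cost(35_000)   # managed harness with hidden prompts (~30-40k)
light = request_cost(15_000)   # lighter custom harness (~15k)
print(round(heavy, 4), round(light, 4), round(heavy / light, 2))
# → 0.105 0.045 2.33
```

At these assumed numbers the managed harness costs roughly 2.3x per request before the model produces a single useful token, which is why commenters frame prompt bloat as a billing issue rather than just a UX one.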
Harness control and alternatives
- Frustration that managed agents don’t allow users to edit system prompts or harness logic, so they’re stuck with Anthropic’s guardrails and bugs.
- Multiple commenters prefer running their own harnesses (OpenCode, Pi, custom tools) with API keys or alternative models (OpenAI Codex, DeepSeek, Qwen).
- Subscription-based access often cannot be used with third‑party harnesses, limiting flexibility.
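The appeal of a custom harness is that the user owns the entire context: the system prompt, the tool definitions, and the conversation history, with no hidden guardrail text injected. A minimal sketch of that loop, with a stub `call_model` standing in for any provider SDK (a real harness would send `messages` to an API with a key; the stub just echoes so the example is runnable):

```python
def call_model(messages):
    # Stub: a real implementation would POST `messages` to a model API.
    # Echoing the last user turn keeps this sketch self-contained.
    return f"echo: {messages[-1]['content']}"

def run_harness(system_prompt, user_turns):
    """Drive a conversation with a system prompt the *user* controls,
    so no hidden instructions sit in the context."""
    messages = [{"role": "system", "content": system_prompt}]
    replies = []
    for turn in user_turns:
        messages.append({"role": "user", "content": turn})
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies

print(run_harness("You are a helpful coding assistant.", ["hi"]))
# → ['echo: hi']
```

Because the whole context is built in `run_harness`, its token footprint is exactly what the user put there, which is the property commenters cite when comparing custom harnesses against managed ones.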
Safety vs user autonomy
- Debate over whether LLMs should strictly refuse anything possibly related to malware or act as neutral tools.
- Some argue providers should allow powerful tools but handle abuse via detection/reporting, not intrusive guardrails.
- Others point out that trying to automate such guardrails can produce exactly this kind of overreach and regressions.
Perceptions of Anthropic’s engineering and product culture
- Many see repeated regressions, opaque prompts, and “vibe-coded” changes as signs of weak testing and poor product discipline.
- Some note Claude Code was ahead of the curve but appears to be degrading, with more bugs and inconsistent behavior.
- A minority suggest these problems may be growth pains rather than pure incompetence, but dissatisfaction dominates the thread.