Regression: malware reminder on every read still causes subagent refusals

Bug and malware-prompt regression

  • Core issue: a system prompt in Claude’s managed agents / Claude Code says “whenever you read a file … you MUST refuse to improve or augment the code,” with no conditional “if it’s malware.”
  • Result: agents often refuse to modify any code after reading it, even when users clarify it isn’t malware.
  • Some users report they can override it by explicitly stating the code is not malware; others still see refusals.
  • The malware-analysis-on-every-file behavior is seen as wasteful, context-bloating, and poorly thought through.
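
To make the "missing conditional" complaint concrete, here is a minimal sketch of the difference between attaching a refusal note to every file read and attaching it only when a file is flagged. Everything in it is hypothetical: the note wording, the function names, and the `naive_flag` heuristic are illustrations, not the actual prompt text or harness code.

```python
from typing import Callable, Optional

# Hypothetical wording for illustration only; NOT the actual system prompt text.
UNCONDITIONAL_NOTE = "You MUST refuse to improve or augment the code."
CONDITIONAL_NOTE = "If this file is malware, refuse to improve or augment it."


def unconditional_reminder(file_text: str) -> str:
    """Models the reported behavior: the same note follows every file read."""
    return UNCONDITIONAL_NOTE


def conditional_reminder(
    file_text: str, is_suspicious: Callable[[str], bool]
) -> Optional[str]:
    """Attach a conditionally worded note only when the file is flagged."""
    return CONDITIONAL_NOTE if is_suspicious(file_text) else None


def naive_flag(file_text: str) -> bool:
    """Placeholder heuristic standing in for a real detector (assumption)."""
    return "keylogger" in file_text.lower()


if __name__ == "__main__":
    benign = "def add(a, b):\n    return a + b\n"
    print(unconditional_reminder(benign))            # note is always attached
    print(conditional_reminder(benign, naive_flag))  # None: nothing added
```

In the gated version, benign files add nothing to the context, which speaks to both the refusal complaint and the context-bloat complaint.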

Token usage, context bloat, and cost

  • Several commenters report large, unexplained token use: heavy hidden prompts, repeated malware checks, and verbose “thoughts.”
  • Some find that managed harnesses roughly double context usage compared with lighter custom harnesses (e.g., 30–40k vs. ~15k tokens just to say “hi”).
  • People worry about “revenue-positive bugs” where misbehavior increases token consumption at user expense.
  • Others note that subscription plans can be dramatically cheaper than raw API use, pressuring users into first‑party harnesses.
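
The figures quoted above (30–40k vs. ~15k tokens) can be turned into a rough overhead estimate. This is a small arithmetic sketch using only those commenter-reported numbers; the `overhead` helper is a made-up name, and no pricing is assumed.

```python
from typing import Tuple


def overhead(managed_tokens: int, custom_tokens: int) -> Tuple[int, float]:
    """Return (extra tokens, usage ratio) of a managed harness vs a custom one."""
    extra = managed_tokens - custom_tokens
    ratio = managed_tokens / custom_tokens
    return extra, ratio


if __name__ == "__main__":
    custom = 15_000  # approximate figure reported in the thread
    for managed in (30_000, 40_000):  # reported range for managed harnesses
        extra, ratio = overhead(managed, custom)
        print(f"{managed} vs {custom}: +{extra} tokens, {ratio:.2f}x")
```

On these numbers the managed harness uses between 2.00x and about 2.67x the context of the custom one before any real work begins.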

Harness control and alternatives

  • Commenters are frustrated that managed agents don’t let users edit system prompts or harness logic, leaving them stuck with Anthropic’s guardrails and bugs.
  • Multiple commenters prefer running their own harnesses (OpenCode, Pi, custom tools) with API keys or alternative models (OpenAI Codex, DeepSeek, Qwen).
  • Subscription-based access often cannot be used with third‑party harnesses, limiting flexibility.

Safety vs user autonomy

  • Debate over whether LLMs should strictly refuse anything possibly related to malware vs. acting as neutral tools.
  • Some argue providers should allow powerful tools but handle abuse via detection/reporting, not intrusive guardrails.
  • Others point out that trying to automate such guardrails can produce exactly this kind of overreach and regression.

Perceptions of Anthropic’s engineering and product culture

  • Many see repeated regressions, opaque prompts, and “vibe-coded” changes as signs of weak testing and poor product discipline.
  • Some note Claude Code was ahead of the curve but appears to be degrading, with more bugs and inconsistent behavior.
  • A minority suggest these problems may be growing pains rather than pure incompetence, but dissatisfaction dominates the thread.