Don't trust AI agents
Containerization and Real Threats
- Debate over whether Docker/Podman/containers are a “hard” security boundary: some point to multiple recent container escapes; others note many require strong pre-existing privileges.
- Several argue that even perfect container security doesn’t fix the main risk: agents holding powerful third‑party credentials (Google, AWS, email). Exfiltrated tokens are far more valuable than `rm -rf /`.
Agent Permissions, Email, and Irreversible Actions
- Many commenters think any agent with inbox access (even “read + draft only”) can still cause serious harm: password resets, magic links, forwarding reset emails, mass exfiltration, or subtle life manipulation (e.g., via reminders/todos).
- Some conclude the only truly safe pattern is “read-only + queue suggestions for human approval,” which is closer to a webhook than an autonomous agent.
- Others argue that autonomous inbox access is inherently unsafe until prompt injection and non‑determinism are fundamentally solved.
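The “read-only + queue suggestions for human approval” pattern can be sketched in a few lines of Python. This is an illustrative sketch only; the class and method names are hypothetical, not from any project discussed in the thread:

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    kind: str
    payload: dict
    approved: bool = False


class ApprovalQueue:
    """Agent output is queued as a suggestion; nothing executes until a human approves."""

    def __init__(self) -> None:
        self._pending: list[ProposedAction] = []
        self._log: list[str] = []

    def propose(self, kind: str, payload: dict) -> ProposedAction:
        # The agent's only write-capable entry point: it can suggest, not act.
        action = ProposedAction(kind, payload)
        self._pending.append(action)
        return action

    def approve(self, action: ProposedAction) -> None:
        # Only the human-approved path ever touches the outside world.
        action.approved = True
        self._log.append(f"executed {action.kind}")
        self._pending.remove(action)

    def pending(self) -> list[ProposedAction]:
        return list(self._pending)


# The "agent" may read mail and draft replies, but can only call propose().
queue = ApprovalQueue()
draft = queue.propose("send_email", {"to": "alice@example.com", "body": "draft reply"})
assert len(queue.pending()) == 1  # nothing has been sent yet
queue.approve(draft)              # human in the loop
```

As the thread notes, this reduces the agent to something closer to a webhook: it is only as autonomous as the approval step allows.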
Nanoclaw’s Model and Critiques
- Nanoclaw’s pitch: small core, each agent in its own container, and “skills” that generate or merge in code on demand so users only get features they explicitly request.
- Critics worry that:
  - Skills are effectively self‑modifying code generated by a nondeterministic LLM, which may be less secure than a conventional plugin system.
- Every install becomes a custom fork, complicating bug reproduction and updates.
- The author positions Nanoclaw as a framework, not turnkey software: users are expected to review diffs and keep the codebase small enough to audit after each skill.
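For contrast with LLM‑merged skills, a conventional plugin system fixes its capability surface at review time: whatever is not registered simply does not exist. A minimal Python sketch (the registry and decorator names are hypothetical, not Nanoclaw’s actual design):

```python
from typing import Callable, Dict

# A conventional plugin registry: capabilities are explicit and auditable
# up front, unlike skills that merge newly generated code at install time.
PLUGINS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a function to the fixed capability surface."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = fn
        return fn
    return wrap


@register("summarize")
def summarize(text: str) -> str:
    # Trivial placeholder behavior for illustration.
    return text[:40] + "…" if len(text) > 40 else text


# Anything not registered is not a capability at all.
assert "summarize" in PLUGINS
assert "exec_shell" not in PLUGINS
```

The trade-off the critics point to: this auditability comes at the cost of Nanoclaw’s “only the features you ask for” flexibility.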
OpenClaw, Code Volume, and AI “Slopware”
- Commenters express shock at OpenClaw’s claimed ~800k lines of TypeScript and thousands of issues/PRs, widely assumed to be largely LLM‑generated.
- This triggers a long subthread on why LoC is a terrible metric, how AI encourages bloated “vibe coded” systems, and how verification and maintainability—not raw output—are what matter.
- Some share positive anecdotes of using LLMs to rapidly build substantial systems, but emphasize that human review remains the real bottleneck.
Proposed Security Patterns and Their Limits
- Suggested mitigations:
- Treat agents like “enthusiastic juniors”: they draft, humans approve.
- No direct secrets; use a hardened proxy/gateway to inject credentials and restrict network access (whitelists, time‑boxed domains, auditing).
- Snapshot/revert for stateful agents; use VMs or microVMs rather than bare containers.
- Limit agents to “recoverable” actions by default.
- Others counter that GET-only tools aren’t truly safe (exfiltration via URLs/logs), proxies themselves can be prompt‑injected, and most schemes still assume non‑adversarial contexts.
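The last counterpoint is easy to demonstrate: a gateway that enforces a host allowlist on GET requests can still let a prompt‑injected agent smuggle a secret out through a query string. A minimal sketch, with a hypothetical `proxy_permits` check and an illustrative allowlist:

```python
from urllib.parse import urlparse

# Illustrative allowlist -- the kind of domain whitelist proposed in the thread.
ALLOWED_HOSTS = {"api.github.com"}


def proxy_permits(url: str) -> bool:
    """A naive gateway check: only GETs to allowlisted hosts pass."""
    return urlparse(url).hostname in ALLOWED_HOSTS


secret = "ghp_exampletoken"  # stands in for an injected credential

# A prompt-injected agent can embed the secret in a query string to an
# allowlisted host: the request passes the host check, but the token now
# appears in the remote server's (and any intermediary's) request logs.
leak = f"https://api.github.com/search/code?q={secret}"
assert proxy_permits(leak)  # the allowlist alone does not stop exfiltration
```

This is why several commenters argue that allowlists and GET-only restrictions narrow the channel but do not close it: any reachable endpoint that logs URLs is a potential exfiltration sink.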
Trust, Accountability, and Human vs AI
- Comparison to contractors/employees: you don’t fully “trust” them either, but there are laws, contracts, and liability. With agents, accountability is unclear, and “just turn it off” is the only recourse.
- Several argue current LLMs lack the judgment to be autonomous in sensitive domains; they’re useful as assistants, not unsupervised actors.
Use Cases and Questioning the Need
- Reported personal uses: email triage, notes, reminders/goals, meeting transcription enrichment, GitHub/Jira cross‑referencing, simple home workflows, personal research.
- Others openly question whether everyday life has enough real “friction” to justify the risk and maintenance burden of powerful autonomous agents, especially ones wired into critical accounts.