Don't trust AI agents

Containerization and Real Threats

  • Debate over whether Docker/Podman/containers are a “hard” security boundary: some point to multiple recent container escapes; others note that many escapes require strong pre-existing privileges.
  • Several argue that even perfect container security doesn’t fix the main risk: agents holding powerful third‑party credentials (Google, AWS, email). Exfiltrated tokens are far more valuable than rm -rf /.

Agent Permissions, Email, and Irreversible Actions

  • Many commenters think any agent with inbox access (even “read + draft only”) can still cause serious harm: password resets, magic links, forwarding reset emails, mass exfiltration, or subtle life manipulation (e.g., via reminders/todos).
  • Some conclude the only truly safe pattern is “read-only + queue suggestions for human approval,” which is closer to a webhook than an autonomous agent.
  • Others suggest this is inherently unsafe until prompt injection and non‑determinism are fundamentally solved.
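The “read-only + queue suggestions for human approval” pattern can be sketched in a few lines. This is an illustrative Python sketch, not any commenter’s actual implementation: the agent is handed only the `propose` method, while the human reviewer alone holds the `execute` callback that performs real actions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

@dataclass
class Suggestion:
    action: str      # e.g. "reply", "archive" (hypothetical action names)
    payload: dict
    status: Status = Status.PENDING

class ApprovalQueue:
    """The agent can only enqueue suggestions; a human must approve
    each one before anything executes."""

    def __init__(self) -> None:
        self._items: list[Suggestion] = []

    def propose(self, action: str, payload: dict) -> Suggestion:
        # The only operation exposed to the agent: append, never execute.
        s = Suggestion(action, payload)
        self._items.append(s)
        return s

    def pending(self) -> list[Suggestion]:
        return [s for s in self._items if s.status is Status.PENDING]

    def approve(self, s: Suggestion,
                execute: Callable[[str, dict], None]) -> None:
        # Only the human-facing caller holds `execute`; the agent never does.
        s.status = Status.APPROVED
        execute(s.action, s.payload)

    def reject(self, s: Suggestion) -> None:
        s.status = Status.REJECTED
```

Note that nothing happens until `approve` is called with the executor, which is exactly why commenters say this is closer to a webhook with a review step than an autonomous agent.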

Nanoclaw’s Model and Critiques

  • Nanoclaw’s pitch: a small core, each agent in its own container, and “skills” that generate code and merge it into the codebase on demand, so users only get features they explicitly request.
  • Critics worry that:
    • Skills are effectively self‑modifying code guided by an LLM (i.e., with nondeterministic output), which may be less secure than a conventional plugin system.
    • Every install becomes a custom fork, complicating bug reproduction and updates.
  • The author positions Nanoclaw as a framework, not turnkey software: users are expected to review diffs and keep the codebase small enough to audit after each skill.
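The review-diffs-before-merging workflow the author describes could look something like the following. This is a hypothetical sketch (Nanoclaw’s real mechanism is not shown in the thread); the key property is that generated skill code is rendered as a diff and gated on an explicit human decision.

```python
import difflib

def render_skill_diff(old_src: str, new_src: str, path: str) -> str:
    """Render a skill's proposed code change as a unified diff for review."""
    return "".join(difflib.unified_diff(
        old_src.splitlines(keepends=True),
        new_src.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))

def apply_if_approved(old_src: str, new_src: str, path: str,
                      approve) -> str:
    """Gate the merge on a human decision (`approve` takes the diff text
    and returns True/False); the agent never writes to the codebase itself."""
    diff = render_skill_diff(old_src, new_src, path)
    if not approve(diff):
        raise PermissionError(f"skill change to {path} rejected")
    return new_src
```

Even with this gate, the critics’ points stand: every approved merge still makes each install a custom fork, and the review burden grows with the codebase.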

OpenClaw, Code Volume, and AI “Slopware”

  • Shock at OpenClaw’s claimed ~800k lines of TypeScript and thousands of issues/PRs, widely assumed to be largely LLM‑generated.
  • This triggers a long subthread on why LoC is a terrible metric, how AI encourages bloated “vibe coded” systems, and how verification and maintainability—not raw output—are what matter.
  • Some share positive anecdotes of using LLMs to rapidly build substantial systems, but emphasize that human review remains the real bottleneck.

Proposed Security Patterns and Their Limits

  • Suggested mitigations:
    • Treat agents like “enthusiastic juniors”: they draft, humans approve.
    • No direct secrets; use a hardened proxy/gateway to inject credentials and restrict network access (whitelists, time‑boxed domains, auditing).
    • Snapshot/revert for stateful agents; use VMs or microVMs rather than bare containers.
    • Limit agents to “recoverable” actions by default.
  • Others counter that GET-only tools aren’t truly safe (exfiltration via URLs/logs), proxies themselves can be prompt‑injected, and most schemes still assume non‑adversarial contexts.
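The proxy/gateway mitigation can be sketched as follows. This is a minimal illustration, not a hardened design (host names like api.example.com are placeholders): the gateway alone holds the real token, injects it into outbound requests, enforces a host allowlist and a time box, and audit-logs every decision. As the counterargument notes, a denied host stops nothing if an allowed host can be abused for exfiltration via URL parameters.

```python
import fnmatch
import time
from urllib.parse import urlparse
from urllib.request import Request, urlopen

class CredentialGateway:
    """Sketch of a credential-injecting gateway: the agent calls fetch()
    and never sees the token, the allowlist, or the expiry."""

    def __init__(self, token: str, allowed_hosts: list[str],
                 ttl_s: float = 300.0):
        self._token = token                       # the agent never sees this
        self._allowed = allowed_hosts             # e.g. ["api.example.com"]
        self._expires = time.monotonic() + ttl_s  # time-boxed access
        self.audit_log: list[str] = []

    def _host_allowed(self, host: str) -> bool:
        return any(fnmatch.fnmatch(host, pat) for pat in self._allowed)

    def fetch(self, url: str) -> bytes:
        if time.monotonic() > self._expires:
            raise PermissionError("time box expired")
        host = urlparse(url).hostname or ""
        if not self._host_allowed(host):
            self.audit_log.append(f"DENIED {url}")
            raise PermissionError(f"host not allowed: {host}")
        self.audit_log.append(f"ALLOWED {url}")
        # Credential is injected here, after the checks, never handed out.
        req = Request(url, headers={"Authorization": f"Bearer {self._token}"})
        with urlopen(req) as resp:
            return resp.read()
```

The gateway narrows the blast radius but, as commenters point out, it is itself part of the trusted computing base: if the agent can talk it into fetching an attacker-controlled URL on an allowed host, data can still leak.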

Trust, Accountability, and Human vs AI

  • Comparison to contractors/employees: you don’t fully “trust” them either, but there are laws, contracts, and liability. With agents, accountability is unclear, and “just turn it off” is the only recourse.
  • Several argue current LLMs lack the judgment to be autonomous in sensitive domains; they’re useful as assistants, not unsupervised actors.

Use Cases and Questioning the Need

  • Reported personal uses: email triage, notes, reminders/goals, meeting transcription enrichment, GitHub/Jira cross‑referencing, simple home workflows, personal research.
  • Others openly question whether everyday life has enough real “friction” to justify the risk and maintenance burden of powerful autonomous agents, especially ones wired into critical accounts.