2026-02-25

Sandboxes won't save you from OpenClaw

Capability-based access and platform lock-in

Many argue the real need is fine-grained, capability-based auth: time- and scope-limited tokens, role-based entitlements, and verifiable mandates for actions (email, payments, API use).
A recurring concern is that big vendors will build these only for their own “in-house” agents, producing Google/Apple/Meta-style walled gardens that don’t interoperate.
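To make the "time- and scope-limited token" idea concrete, here is a minimal sketch using only the Python standard library. It is not any vendor's actual scheme: the token layout, the names `issue_token` and `check_token`, the scope strings, and the signing key are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; a real issuer would keep it out of the agent's reach.
SECRET = b"demo-signing-key"


def issue_token(scopes, ttl_seconds, now=None):
    """Return a signed token granting `scopes` until now + ttl_seconds."""
    now = time.time() if now is None else now
    payload = json.dumps({"scopes": sorted(scopes), "exp": now + ttl_seconds}).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def check_token(token, required_scope, now=None):
    """True only if the signature is valid, the token is unexpired, and the scope was granted."""
    now = time.time() if now is None else now
    try:
        encoded, sig = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:
        # Covers a missing separator and malformed base64.
        return False
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    claims = json.loads(payload)
    return now < claims["exp"] and required_scope in claims["scopes"]
```

An agent holding a token issued with `issue_token(["email:read"], 60)` could read mail for a minute but would fail the check for `"email:send"`. The interoperability worry raised above is exactly that each vendor would define its own incompatible version of this.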

Why sandboxes are insufficient

Core point: a sandbox doesn’t help if the agent inside holds real secrets and valid credentials and can talk to external services.
Sandboxes/VMs protect local machines but not remote APIs, accounts, or money.
Many see OpenClaw’s failures as “within-permission disasters,” not sandbox escapes: deleting inboxes, spending crypto, installing malware.

LLM unreliability and alignment limits

Refrain: “LLM with untrusted input produces untrusted output”; some say even trusted input produces untrusted output.
Instructions like “don’t delete” or “don’t auto-commit” are easily forgotten as context grows.
Recent public incidents are cited as evidence that alignment and “LLM-as-guard” aren’t reliable defenses.

Human-in-the-loop and transaction models

Strong support for human approval of irreversible actions: queued drafts, copy-on-write file edits, shadow transactions, explicit send/publish steps.
Idea: agents run at high speed in a “shadow world,” and humans approve their changes in batches.
Several note this is operationally similar to undo logs and could be built into major services.
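The approval model described above can be sketched as a small queue: reversible actions run immediately, irreversible ones wait for a human. This is an illustrative sketch only; `ApprovalQueue`, `PendingAction`, and the `irreversible` flag are hypothetical names, and the `approve` callback stands in for a real review step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PendingAction:
    description: str
    run: Callable[[], None]


@dataclass
class ApprovalQueue:
    pending: List[PendingAction] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def propose(self, description, run, irreversible):
        """Run reversible actions right away; hold irreversible ones for review."""
        if irreversible:
            self.pending.append(PendingAction(description, run))
            return "queued"
        run()
        self.log.append(description)
        return "done"

    def review(self, approve):
        """Apply each queued action the reviewer approves; drop the rest."""
        batch, self.pending = self.pending, []
        for action in batch:
            if approve(action.description):
                action.run()
                self.log.append(action.description)
```

In use, the agent calls `propose(...)` as fast as it likes, and a person later calls `review(...)` once over the whole batch. Whether an action counts as irreversible is left to the caller here, which is the hard part in practice.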

Practical security patterns emerging

Treat agents like employees: separate machines, separate accounts for email/git/etc., no access to main accounts.
Use local proxies/relays for tools and secrets; agents call the proxy, not the real API directly.
Restrict agents to read-only where possible; require approval for writes.
Suggestions include: RPC-style browser wrappers, OAuth-style client identities, domain-whitelisting proxies, time-boxed network access, VM isolation (Kata/Firecracker).
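Several of these patterns (a proxy that holds the secret, a domain allowlist, read-only by default) can be combined in one policy check. The following is a rough sketch of that check only, with no actual forwarding; the `authorize` function, the allowlisted host, and the key value are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist
READ_ONLY_METHODS = {"GET", "HEAD"}
API_KEY = "secret-held-by-proxy"     # hypothetical; never handed to the agent


def authorize(method, url, write_approved=False):
    """Return the headers the proxy would attach, or raise PermissionError."""
    parts = urlsplit(url)
    # Exact host match, so "api.example.com.evil.net" is rejected.
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise PermissionError(f"host not allowed: {parts.hostname}")
    # Anything other than a read needs explicit approval.
    if method.upper() not in READ_ONLY_METHODS and not write_approved:
        raise PermissionError(f"{method} requires approval")
    return {"Authorization": f"Bearer {API_KEY}"}
```

The agent only ever sees the proxy's answer, never `API_KEY` itself, so a compromised agent cannot carry the credential to a host outside the allowlist. It can still misuse whatever the allowlisted API permits, which is the "within-permission" risk the thread keeps returning to.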

Risk appetite and social commentary

Some see giving agents broad access to personal life/finances as “mind-bogglingly dumb”; others accept risk to offload tedious battles (e.g., bills, insurance disputes).
Speculation about a “botocalypse” in which agents on both sides spam and negotiate with each other.
Disagreement over whether dramatic OpenClaw failure stories are exaggerated or just the tip of the iceberg.
