How to stop AI's "lethal trifecta"
Engineering mindset vs software practice
- Thread picks up on the article’s call for AI engineers to “think like engineers” and extends it to all software involved in physical or high‑impact systems.
- Commenters outline what “real engineering” would look like in software:
- Design for explicit failure modes; assume changes could bankrupt you or send you to jail.
- Apply concepts like safety factors, redundancy, and margins—even to code and ML models.
- Use repeatable processes and professional standards, not “move fast and break things.”
Nature of software and AI systems
- Several people contrast software with bridges: software components are poorly characterized “materials,” highly mutable, and deeply entangled with shifting dependencies (libraries, OS, networks).
- Others note that physical infrastructure also needs continuous maintenance; the real difference is extreme mutability and repurposing, which encourages unsafe redesign-on-the-fly.
- LLMs add another twist: they blur data vs instructions and often behave non‑repeatably under load or temperature, undermining classic safety assumptions.
The “lethal trifecta” and prompt injection
- The trifecta—access to untrusted data, access to secrets, and an exfiltration channel—is seen as essentially “security 101,” but hard to avoid in attractive use cases (email agents, workflow automation).
- Strong view: if all three are present, the system is fundamentally insecure; mitigation is to “cut off a leg,” usually exfiltration.
- Others argue even two legs can be disastrous (destructive actions, data corruption without exfiltration).
- Prompt injection is likened to giving an easily social‑engineered intern root and letting anyone talk to them.
Security controls, limits, and trade‑offs
- Suggested defenses: strict access controls, offline data, no arbitrary outbound network, human‑in‑the‑loop approvals, sandboxing the agent’s OS account.
- Many criticize current “guardrail/filter” products as giving false confidence.
- The CaMeL approach (separating “trusted” and “untrusted” models with constrained code interfaces) is viewed as promising but complex and capability‑reducing.
- Tension is repeatedly noted between safety and the powerful, unified-context agents that businesses want.
Determinism, trust, and human analogies
- Long subthread on whether LLM non‑determinism matters: technically outputs can be made reproducible, but from a security standpoint they must be treated as unpredictable and unprovable.
- Some argue we “already know” how to do security for deterministic systems; others say AI breaks those assumptions, especially because you can’t reliably separate code from data.
- LLMs are compared to non‑deterministic, easily phished humans—but with no accountability and at far greater scale.
Data breaches and lethality
- One commenter downplays data breaches as non‑lethal; others push back with examples where exposed military, political, or personal data plausibly leads to physical harm or major financial damage.
- Consensus: breaches can be part of lethal scenarios, especially combined with AI‑driven exploitation.
Critiques of the Economist framing
- Some praise the earlier, longer article as a clear mainstream explanation of prompt injection, but see the leader as weaker.
- Specific complaints:
- The bridge analogy is strained; real engineers remove dangerous failure modes rather than “over‑engineering around” them.
- Claim that non‑deterministic systems need non‑deterministic safety approaches is called a non sequitur.
- Overall tone (“coders need to…”) and reliance on contrived analogies are viewed as oversimplifying very hard, possibly unsolvable classes of problems.