2025-09-26

How to stop AI's "lethal trifecta"

Engineering mindset vs software practice

Thread picks up on the article’s call for AI engineers to “think like engineers” and extends it to all software involved in physical or high‑impact systems.
Commenters outline what “real engineering” would look like in software:
- Design for explicit failure modes; assume changes could bankrupt you or send you to jail.
- Apply concepts like safety factors, redundancy, and margins—even to code and ML models.
- Use repeatable processes and professional standards, not “move fast and break things.”

Nature of software and AI systems

Several people contrast software with bridges: software components are poorly characterized “materials,” highly mutable, and deeply entangled with shifting dependencies (libraries, OS, networks).
Others note that physical infrastructure also needs continuous maintenance; the real difference is extreme mutability and repurposing, which encourages unsafe redesign-on-the-fly.
LLMs add another twist: they blur data vs instructions and often behave non‑repeatably under load or temperature, undermining classic safety assumptions.

The “lethal trifecta” and prompt injection

The trifecta—access to untrusted data, access to secrets, and an exfiltration channel—is seen as essentially “security 101,” but hard to avoid in attractive use cases (email agents, workflow automation).
Strong view: if all three are present, the system is fundamentally insecure; mitigation is to “cut off a leg,” usually exfiltration.
Others argue even two legs can be disastrous (destructive actions, data corruption without exfiltration).
Prompt injection is likened to giving an easily social‑engineered intern root and letting anyone talk to them.

Security controls, limits, and trade‑offs

Suggested defenses: strict access controls, offline data, no arbitrary outbound network, human‑in‑the‑loop approvals, sandboxing the agent’s OS account.
Many criticize current “guardrail/filter” products as giving false confidence.
The CaMeL approach (separating “trusted” and “untrusted” models with constrained code interfaces) is viewed as promising but complex and capability‑reducing.
Tension is repeatedly noted between safety and the powerful, unified-context agents that businesses want.

Determinism, trust, and human analogies

Long subthread on whether LLM non‑determinism matters: technically outputs can be made reproducible, but from a security standpoint they must be treated as unpredictable and unprovable.
Some argue we “already know” how to do security for deterministic systems; others say AI breaks those assumptions, especially because you can’t reliably separate code from data.
LLMs are compared to non‑deterministic, easily phished humans—but with no accountability and at far greater scale.

Data breaches and lethality

One commenter downplays data breaches as non‑lethal; others push back with examples where exposed military, political, or personal data plausibly leads to physical harm or major financial damage.
Consensus: breaches can be part of lethal scenarios, especially combined with AI‑driven exploitation.

Critiques of the Economist framing

Some praise the earlier, longer article as a clear mainstream explanation of prompt injection, but see the leader as weaker.
Specific complaints:
- The bridge analogy is strained; real engineers remove dangerous failure modes rather than “over‑engineering around” them.
- Claim that non‑deterministic systems need non‑deterministic safety approaches is called a non sequitur.
- Overall tone (“coders need to…”) and reliance on contrived analogies are viewed as oversimplifying very hard, possibly unsolvable classes of problems.

Related topics