CrowdStrike accepting the Pwnie Award for "most epic fail" at DEF CON

Reactions to accepting the Pwnie Award

  • Many see showing up and accepting the “most epic fail” award as the least-bad PR option: refusing would look evasive, whereas attending allows public contrition and serves as a reminder to staff.
  • Others call it “tone deaf” and say it trivializes a catastrophe; they view it as laughing off a disaster that caused global disruption.
  • Some note the acceptance speech came across as sober and self‑critical, not jokey; critics respond that context (a fun con talk, applause, trophy) makes it inappropriate regardless of tone.

Human impact and seriousness of the outage

  • Commenters describe severe real‑world impact: grounded flights, hospital and ER disruptions, 911 outages, pharmacy issues, lost business and productivity.
  • Debate over deaths: some are “certain” people died indirectly (delayed care, emergency stress); others say no concrete evidence has surfaced and stress that hospitals have downtime procedures.
  • Several note that even “just” elevated stress and missed life events (funerals, last goodbyes, surgeries) are serious harms.

Liability, lawsuits, and contracts

  • Many ask why there are few visible lawsuits given claimed losses in the billions.
  • CrowdStrike (CS) contracts reportedly cap liability in the low millions; some argue those caps won’t withstand “gross negligence” claims, especially from insurers.
  • Delta’s suit and CS’s public response are discussed: CS points to contractual caps, hints at aggressive discovery into Delta’s IT practices, and suggests Delta’s prolonged outage was partly its own fault.
  • Some expect insurers and reinsurers to be the main drivers of any serious reckoning, e.g., by surcharging or refusing coverage when CS is in the stack.

Responsibility: CrowdStrike vs customers

  • Strong consensus that CS’s process was egregious: an update that crashes essentially 100% of target Windows systems implies fundamental testing and rollout failures.
  • Key detail: the “rapid response” update apparently bypassed customers’ usual staged rollout controls, leaving them unable to canary it.
  • Others argue enterprises also bear blame for:
    • Allowing a third‑party kernel driver to be a single point of failure on critical systems.
    • Not designing fallback procedures and “analog” continuity plans robust enough for such outages.
    • Over‑relying on cloud and endpoint tools to satisfy auditors and insurers, not genuine risk analysis.
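The staged rollout and canary controls that commenters say were bypassed can be sketched roughly as follows. This is a minimal illustration only, not CrowdStrike's or any vendor's actual mechanism: the ring names, the `apply_update` callback, and the 1% failure threshold are all hypothetical. The idea is that an update goes to a small canary ring first and is halted before reaching later rings if too many hosts fail.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class RingResult:
    ring: str
    attempted: int
    failed: int

    @property
    def failure_rate(self) -> float:
        return self.failed / self.attempted if self.attempted else 0.0


def staged_rollout(
    rings: Sequence[Tuple[str, Sequence[str]]],
    apply_update: Callable[[str], bool],  # hypothetical: returns True if the host stays healthy
    max_failure_rate: float = 0.01,       # hypothetical threshold
) -> List[RingResult]:
    """Apply an update ring by ring; halt once a ring exceeds the failure threshold."""
    results: List[RingResult] = []
    for name, hosts in rings:
        failed = sum(1 for host in hosts if not apply_update(host))
        result = RingResult(name, len(hosts), failed)
        results.append(result)
        if result.failure_rate > max_failure_rate:
            break  # later rings never receive the update
    return results


# Hypothetical fleet: a small canary ring, then progressively larger rings.
rings = [
    ("canary", ["c1", "c2"]),
    ("early", ["e1", "e2", "e3"]),
    ("broad", ["b1", "b2", "b3", "b4"]),
]

# An update that crashes every host is stopped at the canary ring.
halted = staged_rollout(rings, lambda host: False)
print([r.ring for r in halted])
```

With a gate like this, an update that fails on essentially every host would take down only the canary ring. The complaint in the thread is that the “rapid response” content reportedly did not pass through any such customer-controlled gate.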

Software vs civil engineering and calls for accountability

  • Large sub‑thread compares software to civil engineering:
    • One side: bridges have clear standards, licensing, and personal liability; software should evolve similar norms, especially for life‑critical systems.
    • Opposing view: software changes too fast, is vastly more complex, and is attacked continuously; perfect safety is impossible and over‑regulation would cripple competitiveness.
  • Some advocate for a professionalized “real engineering” tier with licenses and sign‑off liability for safety‑critical code; others warn it would mostly create rent‑seeking gatekeepers and push innovation offshore.

Security tooling, SPOFs, and industry incentives

  • Many criticize the entire model of managed endpoint security:
    • Closed‑source kernel code parsing untrusted input is seen as inherently dangerous.
    • Centralized products that can remotely brick all endpoints are called “security single points of failure.”
  • Commenters note that many organizations deploy such tools mainly to tick compliance/insurance boxes; the risk of catastrophic vendor failure was underappreciated.
  • Some argue that if a system is truly life‑critical, running networked Windows with third‑party kernel agents is itself negligent, regardless of CS’s bug.
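To illustrate why parsing untrusted input is considered risky, here is a minimal sketch of the "validate before use" discipline. The record layout (a header with a field count and field size, followed by fixed-size fields) is entirely hypothetical and is not CrowdStrike's actual content format. In kernel-mode C an unchecked index into such data can be a memory-safety error that crashes the machine; the Python version only shows the shape of the checks.

```python
import struct
from typing import List

# Hypothetical header: little-endian field count and field size, 2 bytes each.
HEADER = struct.Struct("<HH")


class MalformedRecord(ValueError):
    """Raised when untrusted input does not match the expected layout."""


def parse_record(blob: bytes, expected_fields: int) -> List[bytes]:
    """Reject the record unless every length check passes, then slice it."""
    if len(blob) < HEADER.size:
        raise MalformedRecord("truncated header")
    count, size = HEADER.unpack_from(blob, 0)
    if count != expected_fields:
        raise MalformedRecord(f"expected {expected_fields} fields, got {count}")
    if size == 0:
        raise MalformedRecord("zero-length fields")
    if len(blob) != HEADER.size + count * size:
        raise MalformedRecord("body length does not match header")
    start = HEADER.size
    return [blob[start + i * size : start + (i + 1) * size] for i in range(count)]


# A well-formed record with two 3-byte fields parses cleanly.
fields = parse_record(HEADER.pack(2, 3) + b"abcdef", expected_fields=2)
print(fields)
```

A record whose declared field count disagrees with what the consumer expects is rejected up front rather than indexed into.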

What consequences should follow

  • Views range from “nuke the company” (bankrupt it and reconstitute it as a warning) to “fix the processes, don’t scapegoat individuals,” similar to how some large outages at other providers were handled.
  • Skeptics doubt meaningful change will occur without:
    • Legal liability that survives EULAs and caps.
    • Insurance pressure that makes unsafe stacks uninsurable.
    • A cultural shift away from “move fast and break things” toward genuine engineering discipline.