Siblings miss crucial life-extending treatment because of CrowdStrike outage
Scope of the incident
- The article describes two siblings who missed a rare, life‑extending infusion at Seattle Children’s because of the CrowdStrike/Microsoft outage.
- Some commenters question whether this specific missed appointment was truly life‑threatening, noting the article itself mentions timing “wiggle room.”
- Others report local disruptions of varying severity (e.g., postponed prescriptions, long pharmacy lines, major hospitals in Boston curtailing operations).
Why care stops when IT stops
- Hospitals are now tightly coupled to electronic systems (notably Epic EHR). Staff rely on them for:
  - Medication orders, allergies, and interaction checks.
  - Access to imaging, lab results, and specialist reads.
  - Logistics, scheduling, and inventory.
- Commenters stress that “just use paper” is unrealistic at scale; paper workflows are slower, unpracticed, and often not truly maintained.
- Liability and policy are major factors: staff fear being blamed for bypassing barcode scans or EHR procedures, even when they are technically able to treat the patient.
Pen, paper, and workarounds
- Some argue critical procedures should still be doable with manual backups, citing other sectors (forestry, smaller clinics, German practices) that function during outages.
- Others counter that:
  - Modern medicine involves complex, computer‑dependent devices and calibrations.
  - Regulations (e.g., 21 CFR Part 11) and vendor lock‑in make ad‑hoc reinstallation or bypassing of systems impractical or forbidden.
  - Organizational culture prioritizes compliance and job/insurance risk over individual initiative.
Responsibility and blame
- Many place primary fault on CrowdStrike for having a de‑facto kill switch that bricked endpoints globally.
- Others emphasize shared blame:
  - Hospitals and IT for brittle architectures, lack of staged rollouts, and inadequate disaster planning.
  - Regulators, auditors, cyber‑insurance, and procurement practices that effectively mandate kernel‑level EDR on near‑life‑critical systems.
- CrowdStrike’s own terms of use explicitly disclaim use in direct or indirect life‑support contexts; commenters debate how meaningful or enforceable this disclaimer is.
Architecture, QA, and systemic risk
- Discussion of CrowdStrike’s update model: an automatically pushed “channel” data file bypassed normal staging and caused boot loops.
- Some point to cost‑cutting on QA (outsourcing, under‑resourcing) and lack of staged deployment as key process failures.
- Broader theme: over‑centralized, highly optimized digital systems lack resilience; when they fail, hospitals can’t easily fall back to safe, slower modes of operation.
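
The staged deployment that commenters say was missing can be illustrated with a minimal sketch. It is not CrowdStrike's actual update mechanism; the `Ring` type, the `staged_rollout` function, the host names, and the health check are all hypothetical, chosen only to show the idea of pushing an update to a small group first and halting before it reaches everyone.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ring:
    """A hypothetical deployment ring: a named group of hosts updated together."""
    name: str
    hosts: list[str]


def staged_rollout(
    rings: list[Ring],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> tuple[list[str], Optional[str]]:
    """Deploy ring by ring and stop at the first ring that looks unhealthy.

    Returns the names of rings that completed, plus the name of the ring
    where the rollout halted (None if every ring succeeded).
    """
    completed: list[str] = []
    for ring in rings:
        failures = 0
        for host in ring.hosts:
            deploy(host)
            if not is_healthy(host):
                failures += 1
        rate = failures / len(ring.hosts) if ring.hosts else 0.0
        if rate > max_failure_rate:
            # Halt here: later (larger) rings never receive the update.
            return completed, ring.name
        completed.append(ring.name)
    return completed, None


# Illustrative rings, smallest and least critical first (names are made up).
rings = [
    Ring("canary", ["test-vm-1", "test-vm-2"]),
    Ring("early", ["clinic-pc-1"]),
    Ring("broad", ["ward-pc-1", "ward-pc-2"]),
]

# Simulate a bad update: every host fails its health check after deployment.
deployed: list[str] = []
completed, halted = staged_rollout(rings, deployed.append, lambda host: False)
print(completed, halted, deployed)  # [] canary ['test-vm-1', 'test-vm-2']
```

In this simulation the faulty update reaches only the two canary machines; the clinic and ward hosts are never touched. A real system would also need a rollback path and a health signal that still works when a machine cannot boot, which this sketch does not model.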