No More Blue Fridays

eBPF as an Alternative to Kernel Drivers

  • Many commenters agree that replacing third‑party kernel modules with eBPF-based code would reduce the chance of system-wide crashes, especially for security/observability tools.
  • Benefit: one shared, heavily scrutinized verifier and runtime instead of many vendors shipping their own fragile kernel code.
  • Several note that Linux support is solid on modern and LTS kernels, and that some enterprise distros backport eBPF features to older kernel bases.

Safety, Verifier, and the Halting Problem

  • eBPF programs are statically checked: bounded loops only, strict memory access rules, limited helper APIs.
  • Multiple comments emphasize that eBPF is not Turing-complete; termination is enforced in part by rejecting loops the verifier cannot prove bounded and in part by instruction limits.
  • The verifier is large (~20k LOC) and complex. Some see this as rigor; others see a big attack surface and hard-to-audit code.
  • Clarification: the safety guarantee holds only if the verifier itself and the helper functions it exposes are free of bugs.
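
As a rough illustration of the termination argument above, here is a toy checker in Python. It is not the Linux verifier and shares no code with it: the instruction tuples, the `verify` function, and the `MAX_INSNS` value are all invented for this sketch. It accepts a program only if it fits under a size cap, every jump goes strictly forward, and the last instruction is `exit`. Under those rules a run can never revisit an instruction, so it ends within `len(program)` steps. Real verifiers are more permissive (they accept loops they can prove bounded) and also track register and memory state, which this sketch ignores.

```python
MAX_INSNS = 4096  # illustrative cap for this toy, not a real kernel limit

KNOWN_OPS = {"mov", "add", "jmp", "jeq", "exit"}  # hypothetical toy opcodes


def verify(program):
    """Statically check a toy program; return (ok, reason).

    A program is a list of tuples such as ("mov", reg, imm),
    ("jmp", offset), ("jeq", reg, imm, offset) or ("exit",).
    Jump offsets are relative to the next instruction.
    """
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSNS:
        return False, "too many instructions"
    if program[-1][0] != "exit":
        return False, "last instruction must be exit"
    for pc, insn in enumerate(program):
        op = insn[0]
        if op not in KNOWN_OPS:
            return False, f"unknown opcode at {pc}"
        if op in ("jmp", "jeq"):
            offset = insn[-1]
            if offset < 0:
                # A backward jump could form a loop; this toy rejects all of them.
                return False, f"back-edge at {pc}"
            if pc + 1 + offset >= len(program):
                return False, f"jump out of range at {pc}"
    return True, "ok"


straight = [("mov", 0, 1), ("add", 0, 2), ("exit",)]
loop = [("mov", 0, 0), ("add", 0, 1), ("jmp", -2), ("exit",)]

print(verify(straight))
print(verify(loop))
```

The straight-line program is accepted, while the second is rejected because of the backward jump at index 2. Note that the check is only as trustworthy as `verify` itself, which is the point of the clarification in the last bullet.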

Limits and Remaining Failure Modes

  • Several commenters point to past kernel panics triggered through eBPF code paths, including by security products, and consider “immune to crashes” marketing overreach.
  • Even if the kernel doesn’t crash, a bad eBPF program or ruleset can still effectively DoS a machine (e.g., by overblocking or exhausting resources).
  • eBPF can’t replace all kernel code (e.g., full device/graphics drivers); it’s mostly suitable for instrumentation, filtering, and some enforcement.
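
The overblocking case can be sketched without any kernel code at all. The Python below is a hypothetical first-match ruleset evaluator, not any vendor's format: the `decide` function, the rule tuples, and the paths are assumptions made for illustration. One overly broad pattern in an update denies every essential binary while nothing crashes.

```python
from fnmatch import fnmatchcase


def decide(rules, path):
    """Return the action of the first rule whose glob matches path.

    rules is a list of (glob_pattern, action) pairs; default is "allow".
    """
    for pattern, action in rules:
        if fnmatchcase(path, pattern):
            return action
    return "allow"


# A narrowly scoped ruleset (hypothetical).
good_rules = [("/tmp/malware*", "deny")]

# A hypothetical bad update: one overly broad pattern placed first.
bad_rules = [("/*", "deny")] + good_rules

essential = ["/usr/sbin/sshd", "/usr/bin/systemd", "/bin/sh"]

blocked_before = [p for p in essential if decide(good_rules, p) == "deny"]
blocked_after = [p for p in essential if decide(bad_rules, p) == "deny"]

print(blocked_before)
print(blocked_after)
```

With the narrow ruleset none of the essential paths are blocked. After the bad update all of them are, so the machine stays up but is effectively unusable.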

Windows, ETW, and Ecosystem Questions

  • Windows eBPF support is currently limited (mostly networking hooks). Commenters doubt it can yet replace complex kernel-resident security drivers such as ELAM (Early Launch Anti-Malware) drivers.
  • Some argue Windows already has ETW (Event Tracing for Windows) and file-system filter frameworks; performance and coverage, not a lack of hooks, are the major constraints.
  • Others expect more eBPF hooks over time but see full parity with Linux as “years away.”

CrowdStrike Outage, Testing, and Social Factors

  • Strong debate around canary/staged rollouts for AV/EDR updates:
    • One side: staged rollouts are industry standard and would have greatly reduced the blast radius.
    • Other side: security vendors are under pressure from SLAs and mean-time-to-detect (MTTD) targets, and fear customer backlash if some systems receive protections later than others.
  • Several argue no technical mechanism (eBPF, Rust, formal methods) can replace organizational discipline, robust QA, and sane deployment practices.
  • Broader critique: OS vendors, especially for Windows, should reduce kernel extensibility or offer safer, mandatory interfaces rather than rely on third-party kernel code.
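
The staged-rollout idea debated above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of how any vendor deploys updates: the `STAGES` percentages, the `max_failure_rate` threshold, and the `is_healthy` callback are all hypothetical. Hosts are assigned to stable buckets by hashing their ID, each stage widens the set of eligible buckets, and the rollout stops at the first stage whose failure rate exceeds the threshold.

```python
import hashlib

STAGES = [1, 10, 50, 100]  # hypothetical percent of the fleet per stage


def bucket(host_id):
    """Map a host ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def rollout(hosts, is_healthy, max_failure_rate=0.01):
    """Update hosts stage by stage; halt when a stage looks unhealthy.

    is_healthy(host) is a caller-supplied check run after updating a host.
    Returns (completed, updated_hosts).
    """
    updated = set()
    for percent in STAGES:
        # Only hosts newly eligible at this stage are updated.
        batch = [h for h in hosts if bucket(h) < percent and h not in updated]
        updated.update(batch)
        failures = sum(1 for h in batch if not is_healthy(h))
        if batch and failures / len(batch) > max_failure_rate:
            return False, updated  # stop before widening the blast radius
    return True, updated


fleet = [f"host-{i}" for i in range(1000)]

ok_good, touched_good = rollout(fleet, lambda h: True)
ok_bad, touched_bad = rollout(fleet, lambda h: False)

print(ok_good, len(touched_good))
print(ok_bad, len(touched_bad))
```

With a healthy update the rollout completes across the whole fleet. With an update that breaks every host, it halts at the first non-empty stage, so only a fraction of the fleet is affected. The trade-off raised by the other side is visible too: hosts in later buckets receive protections later.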