Three ways formally verified code can go wrong in practice

Scope of Formal Verification vs. Hardware Reality

  • Multiple comments stress that software proofs don’t eliminate hardware failures: RAM bit flips, transient CPU faults (e.g., erroneously executed branches), and other soft errors still occur.
  • Formal methods can be applied to hardware too (e.g., ECC RAM designs), but they still rely on probabilistic assumptions about error rates, not absolute guarantees.
  • Safety‑critical systems use redundancy (lockstep processors, TMR, multiple independent controllers, spatial/orientation separation of computers) to tolerate faults not modeled at the software level.
  • There’s pushback against modeling “cosmic‑ray GOTO after every statement” in software; you must draw a line on what’s assumed vs. mitigated by hardware/redundancy.

Assumptions, Assertions, and Runtime Checking

  • Debate over whether formally verified code should assert its assumptions at runtime.
  • Objections:
    • Crashing on violated assumptions is often exactly what the verification aimed to avoid.
    • Many assumptions (correct compiler, no data races, correct hardware, no OS interference) are not checkable in code.
    • Checking some assumptions (e.g., array sortedness before binary search) is prohibitively expensive.
  • Clarification around “assert” vs. “verify”:
    • “Verify” = actually check (possibly expensive).
    • “Assert” = state as true, maybe checked, maybe not; in many languages, it’s just “crash if false,” sometimes even compiled out.
  • C/C++ example: signed overflow is undefined behavior, so a compiler may assume it never happens and optimize away post‑hoc checks such as `if (a + b < a)`, undermining naive use of asserts.
  • Some advocate partial/cheap checks (sanity checks, invariants, bounds checks) or logging instead of crashing, especially when full proofs are absent.

Verification vs. Validation and Environment Modeling

  • Recurrent distinction:
    • Verification: “Are we building it right?” (meets spec).
    • Validation: “Are we building the right thing?” (meets real‑world needs/requirements).
  • Others use “verification” for formal proofs and “validation” for empirical/testing‑based assurance.
  • Several note that many interesting properties depend on the environment (OS, hardware, user behavior). Fully formalizing that environment is viewed as practically impossible.
  • One thread pushes an epistemological view: all testing is a form of formal method with implicit assumptions; formal verification just makes those assumptions explicit.
  • Neural‑network‑based definitions (e.g., “rose detector”) are proposed as a way to formalize fuzzy goals; critics argue this just moves uncertainty into the NN and cannot be fully proved.

Practical Value and Limits of Formal Methods

  • Commenters describe domains (finance, legal workflows, safety‑critical control) where errors are extremely costly, motivating proofs for selected components.
  • Others emphasize diminishing returns: use proofs for critical invariants, strong typing and assertions elsewhere, and accept residual risk.
  • Wrong or incomplete specs (e.g., ignoring integer overflow in a “proved correct” algorithm) are highlighted as a core failure mode not solved by the tooling itself.