2025-04-29

Oracle engineers caused five-day software outage at U.S. hospitals

Why organizations still buy Oracle / Cerner

Several comments argue Oracle wins through aggressive enterprise sales, courting executives and CTOs with “we’ll handle everything” pitches, rather than through developer preference.
Others note in this case “Oracle wasn’t bought, Cerner was” and that Cerner’s core products historically sat on Oracle backends.
Oracle is seen as offering a full-stack menu (DB, ERP, identity, CRM, cloud, etc.) that appeals to large bureaucracies wanting one vendor and “one throat to choke.”
Some mention educational seeding: universities teaching Java/Oracle stacks create a pipeline of Oracle-literate juniors.

Legacy lock‑in, mainframes, and COBOL

Many see Oracle use as legacy lock‑in: systems built decades ago when alternatives were weaker, now too risky/expensive to replace.
Comparisons to COBOL/mainframes: systems run for 40–60 years, deeply embedded in business processes; migration is huge and rarely justified if the old system “still works.”
Discussion on COBOL careers: some say it’s a high-pay, long-term niche; others call it “zombieware” that young devs avoid.
A few suggest AI-assisted codebase translation as an unexplored opportunity, but others note the complexity and risk.

Technical views on Oracle vs Postgres/SQL Server

Strong split: many insist Postgres is better for 99% of use cases; others say if money is no object, OracleDB still wins for extreme scale and fine-grained control.
Examples cited: Oracle’s partitioned global unique indexes, ability to pin/prioritize execution plans, Exadata storage-level optimizations.
Postgres’s refusal to fully honor query hints is highlighted as a pain point in some mission-critical scenarios.
Some note OracleDB is technically impressive but surrounded by awful licensing, audits, tooling, and operational complexity.
Others argue Oracle quality is generally low across its vast product line, even if the core database engine is strong.

Cause of the outage: human error vs process failure

The reported root cause (“engineers deleted critical storage”) leads many to blame poor change management rather than individual engineers.
Several describe what good process should look like in healthcare: strict procedures, staged disablement, read‑only aging, delayed physical deletion, clear rollback and recovery plans.
The multi-day recovery time is read as evidence that procedures, safeguards, and tested backups were inadequate.
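The staged process commenters describe (disable first, age read-only, delete physically only after a delay, keep rollback possible) can be sketched as a small state machine. This is a minimal illustration, not how Oracle or Cerner manage storage; the `Volume` type, the stage names, and the 7-day and 30-day aging periods are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum


class Stage(Enum):
    ACTIVE = 1
    DISABLED = 2    # detached from consumers, data untouched
    READ_ONLY = 3   # kept read-only for an aging period
    DELETED = 4     # physical deletion, only after aging elapses


@dataclass
class Volume:
    # Hypothetical record for a storage volume being decommissioned.
    name: str
    stage: Stage = Stage.ACTIVE
    stage_since: datetime = field(default_factory=datetime.now)


# Assumed minimum time a volume must spend in a stage before moving on.
MIN_AGE = {
    Stage.DISABLED: timedelta(days=7),
    Stage.READ_ONLY: timedelta(days=30),
}


def advance(vol: Volume, now: datetime) -> Stage:
    """Move exactly one stage forward, refusing to cut an aging period short."""
    if vol.stage is Stage.DELETED:
        raise ValueError(f"{vol.name} is already deleted")
    required = MIN_AGE.get(vol.stage, timedelta(0))
    if now - vol.stage_since < required:
        raise PermissionError(
            f"{vol.name} must stay in {vol.stage.name} for {required}"
        )
    vol.stage = Stage(vol.stage.value + 1)
    vol.stage_since = now
    return vol.stage


def rollback(vol: Volume, now: datetime) -> Stage:
    """Return to ACTIVE from any stage short of physical deletion."""
    if vol.stage is Stage.DELETED:
        raise ValueError("cannot roll back a physical deletion; restore from backup")
    vol.stage = Stage.ACTIVE
    vol.stage_since = now
    return vol.stage
```

Because `advance` moves only one stage at a time, no single call can take a volume from active to physically deleted, and every stage before `DELETED` stays reversible through `rollback`.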

Culture, blame, and management pressure

Commenters debate whether such failures stem from unrealistic deadlines and executive pressure or from plain incompetence and bad ops hygiene.
Some argue it’s a pattern: decisions and budgets arrive late, but delivery dates don’t move, forcing compressed, risky work.
Others push back on reflexively blaming management, emphasizing that individuals and organizations share responsibility.

LLMs, “vibe coding,” and reliability

Thread digresses into AI-assisted “vibe coding”: rapid progress initially but poor architecture, weak understanding, and fragile prototypes.
Experienced developers report that heavy LLM reliance can degrade learning and insight; LLMs are seen as powerful search/idea tools, not mentors or safety nets.
Concern that novices using LLMs may ship systems they don’t understand deeply enough to debug safely in critical environments.

Healthcare / EHR specifics and Cerner design

Cerner’s architecture is criticized: shared “multitenant” database setups for multiple hospitals and high access privileges (e.g., widespread SSH and production DB write access).
Multiple EHRs (Cerner, Epic, others) are described as dreadful from clinician and operational perspectives, even when technically “up.”
Some note regional regulatory constraints (e.g., strict privacy rules) making generic EHR products hard to adapt.
Skepticism that Oracle’s future AI-based EHR will prioritize real quality over marketing.
