Simplicity – Google SRE Handbook (2017)
Trust in Google & Alleged Hypocrisy
- Several commenters distrust Google’s advice due to:
  - Account deletions (consumer and GCP), including the high‑profile pension fund incident.
  - A perception that Google often deletes users/services once they become “inconvenient.”
- Others argue:
  - Every large org makes mistakes; Google’s reliability is still high.
  - Guidance can be sound even if not perfectly followed internally.
  - Dismissing advice solely because Google wrote it is an ad hominem.
Applicability & Cargo Culting
- Many see the simplicity chapter as broadly good advice for anyone running services, from solo devs to large orgs.
- Strong warnings against:
  - Copying “Google practices” into small companies and over‑engineering.
  - Treating the SRE book as a universal manual rather than context‑specific essays.
- Some think the content consists of vague, feel‑good truisms that mainly fuel Google cargo culting and LinkedIn‑style virtue signaling.
Simplicity, Complexity, and Human Factors
- Debate over the book’s claim that emotional attachment to code drives complexity:
  - Critics: the main drivers are incentives, underinvestment in maintenance, and the difficulty of measuring long‑term costs; blaming “emotions” becomes a thought‑terminating cliché.
  - Supporters: emotional attachment, sunk cost, job security, and pride in complex systems are real and common.
- General agreement that:
  - Complexity often accumulates via endless features and weak cleanup incentives.
  - Simplicity requires continuous pruning, not rare giant rewrites.
- Discussion on:
  - Overuse of feature flags/configs vs deleting code.
  - Commented‑out “dead code” vs relying on version control; some find VCS‑based reversion hard in practice.
SRE Role, Culture, and Org Dynamics
- Some portray SRE organizations as:
  - Philosophical, heavy‑handed, and sometimes contemptuous of developers.
  - Mandating complexity themselves while criticizing SWE systems.
- Others note:
  - SRE concepts (SLOs, error budgets, etc.) are metric‑focused, but metrics are not automatically “science.”
  - Many companies label ops teams as “SRE” without giving them actual engineering authority, leading to dysfunction.
- Observations that internal politics (SRE as “guardian of simplicity” vs SWE as “complexity drivers”) can be unhealthy.
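For readers unfamiliar with the error budgets mentioned above, here is a minimal sketch of the arithmetic behind one, assuming a simple availability SLO measured over a fixed window. The function names, the 30‑day window, and the sample figures are illustrative assumptions, not taken from the SRE book or the discussion.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO.

    `slo` is a fraction such as 0.999; `window_days` is a hypothetical
    measurement window chosen for illustration.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 21.6 minutes of downtime, about half of the budget is left.
    print(round(budget_remaining(0.999, 21.6), 2))
```

The number itself is plain arithmetic; what an organization does when the budget runs out is a policy choice, which is where the “metrics are not automatically science” objection applies.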
Google Cloud Incident & Backups
- Disagreement over the pension fund outage:
  - Some assert that Google deleted the entire account and all backups, so that recovery was possible only via another provider.
  - Others cite Google and customer statements saying that GCS backups in the same region were intact and used, and that only VM configuration was deleted.
- Downtime of ~13 days is seen by some as incompatible with “rapid” restoration claims.