Upgrading Uber's MySQL Fleet
Version choice: MySQL 8.0 vs 8.4
- Several commenters ask why Uber upgraded to 8.0 (old LTS) instead of 8.4 (current LTS).
- Answers from the thread:
- Project started in 2023, before 8.4 existed (released April 2024).
- Direct upgrade 5.7 → 8.4 is not supported; 8.0 is a required step.
- 8.0 is seen as more “battle-tested”; 8.4 likely has more unknowns.
- Some expect that 8.0 → 8.4 later will be relatively cheap now that tooling and processes are in place.
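The "8.0 is a required step" point can be sketched as a small path lookup. This is only an illustration of the hops named in the thread (5.7 → 8.0 and 8.0 → 8.4); the `SUPPORTED_HOPS` table and `plan_upgrade` helper are hypothetical names, not part of any MySQL tooling.

```python
# Upgrade hops between release series as described in the thread.
# Illustrative only; consult the MySQL upgrade documentation for real rules.
SUPPORTED_HOPS = {
    "5.7": ["8.0"],
    "8.0": ["8.4"],
    "8.4": [],
}


def plan_upgrade(source: str, target: str) -> list[str]:
    """Return the chain of release series to pass through, source first."""
    path = [source]
    current = source
    while current != target:
        next_hops = SUPPORTED_HOPS.get(current, [])
        if not next_hops:
            raise ValueError(f"no supported path from {source} to {target}")
        # Each series in this table has at most one onward hop.
        current = next_hops[0]
        path.append(current)
    return path


print(plan_upgrade("5.7", "8.4"))
```

Under these assumptions a fleet on 5.7 always passes through 8.0, which is why commenters expect the later 8.0 → 8.4 hop to reuse the same tooling.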
Stability, bugs, and MySQL vs MariaDB
- Strong disagreement over MySQL’s stability:
- Some report years of trouble‑free use at banks and large companies.
- Others describe serious bugs, especially in newer or less‑used features (e.g., table rename issues, InnoDB full‑text search, multi‑valued indexes (MVIs), regex, JSON).
- MySQL 8.0 is criticized for frequent behavioral changes in patch releases.
- MariaDB is suggested as an alternative, but:
- Commenters note it is no longer a true drop‑in replacement: there are significant DDL and replication differences.
- Some major projects now explicitly do not support MariaDB.
Postgres, VACUUM, and MVCC tradeoffs
- Big subthread comparing MySQL vs Postgres:
- MySQL praised as “smooth” and low‑maintenance for some high‑churn use cases.
- Multiple complaints about Postgres VACUUM being a resource hog and operationally painful at scale, especially with:
- Very large, heavily updated tables.
- Extremely high table counts (hundreds of thousands per DB).
- Others counter that:
- Modern autovacuum is much improved; tuning per‑table and sharding help.
- Routine VACUUM (not VACUUM FULL) is usually fine; problems come from misconfig or edge workloads.
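- Some explain the underlying difference: InnoDB updates rows in place and keeps old versions in undo logs, whereas Postgres writes new tuple versions to the heap and relies on VACUUM to reclaim dead ones.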
- Consensus: Postgres can run very well, but demands more DBA expertise and ongoing maintenance than MySQL.
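The argument that per‑table autovacuum tuning matters for very large tables can be made concrete with the trigger formula from the Postgres documentation: autovacuum runs on a table once dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * rows`, with defaults of 50 and 0.2. The sketch below only does that arithmetic; the table sizes and the tuned scale factor of 0.01 are made‑up examples, not figures from the thread.

```python
def autovacuum_trigger(rows: int, threshold: int = 50,
                       scale_factor: float = 0.2) -> int:
    """Dead tuples needed before autovacuum considers the table.

    Defaults mirror Postgres' autovacuum_vacuum_threshold (50) and
    autovacuum_vacuum_scale_factor (0.2).
    """
    return int(threshold + scale_factor * rows)


# Hypothetical table sizes, chosen only to show how the trigger scales.
for rows in (10_000, 1_000_000_000):
    default = autovacuum_trigger(rows)
    tuned = autovacuum_trigger(rows, scale_factor=0.01)  # example per-table override
    print(f"{rows:>13,} rows: default={default:,} tuned={tuned:,}")
```

With the default scale factor, a billion‑row table accumulates about 200 million dead tuples before autovacuum starts, which is one way the "VACUUM is a resource hog" experience can arise on large, heavily updated tables.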
Scale, architecture, and capacity utilization
- Uber’s numbers (≈3M QPS, 2.1K clusters, 16K nodes) prompt debate:
- Simple averaging gives ~200 QPS per node or ~1.4K QPS per cluster, which some see as low and potentially overprovisioned.
- Others argue this division is meaningless:
- Load is highly uneven across clusters, regions, and times of day.
- Mix of primaries/replicas and differing workloads makes “QPS per node” a poor metric.
- Some speculate that their architecture may be expensive relative to alternatives (e.g., DynamoDB, KV stores), but acknowledge lack of visibility into schema and query patterns.
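Both sides of the averaging debate can be checked with a few lines of arithmetic. The totals are the rounded figures quoted above; the skewed split (10% of clusters serving 80% of traffic) is a purely hypothetical assumption used to illustrate the "load is uneven" counterargument, not a number from Uber or the thread.

```python
# Rounded figures quoted in the thread.
TOTAL_QPS = 3_000_000
CLUSTERS = 2_100
NODES = 16_000

# Simple averages, as computed by the commenters.
qps_per_node = TOTAL_QPS / NODES
qps_per_cluster = TOTAL_QPS / CLUSTERS
nodes_per_cluster = NODES / CLUSTERS

print(f"avg QPS per node:    {qps_per_node:.1f}")
print(f"avg QPS per cluster: {qps_per_cluster:.1f}")
print(f"avg nodes per cluster: {nodes_per_cluster:.1f}")

# HYPOTHETICAL skew: 10% of clusters carry 80% of the traffic.
HOT_CLUSTER_SHARE = 0.10
HOT_TRAFFIC_SHARE = 0.80

hot_clusters = round(CLUSTERS * HOT_CLUSTER_SHARE)
cold_clusters = CLUSTERS - hot_clusters
hot_qps_per_cluster = TOTAL_QPS * HOT_TRAFFIC_SHARE / hot_clusters
cold_qps_per_cluster = TOTAL_QPS * (1 - HOT_TRAFFIC_SHARE) / cold_clusters

print(f"hot cluster QPS:  {hot_qps_per_cluster:.0f}")
print(f"cold cluster QPS: {cold_qps_per_cluster:.0f}")
```

Under that assumed skew the busiest clusters see several times the fleet average while most sit far below it, so the average alone says little about whether any given cluster is overprovisioned.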
Password rotation and MySQL authentication
- Commenters like MySQL 8’s dual-password feature for smooth credential rotation vs painful “big bang” changes.
- Thread notes:
- mysql_native_password is deprecated in 8.0 and disabled by default in 8.4, but can be re‑enabled.
- MySQL 9.0 removes it entirely.
- Migrating off old drivers and auth methods may surprise lagging applications.
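The dual‑password rotation praised above follows a three‑step pattern, sketched here as a helper that prints the SQL. The `ALTER USER ... RETAIN CURRENT PASSWORD` and `DISCARD OLD PASSWORD` clauses are MySQL 8.0 syntax; the `rotation_steps` function, the `app` account, and the `<new-password>` placeholder are hypothetical, and a real rollout would pass the secret through a client library rather than a string template.

```python
def rotation_steps(user: str, host: str = "%") -> list[str]:
    """Return the statements for a dual-password rotation, in order.

    The password is left as a placeholder on purpose; this sketch never
    handles a real secret.
    """
    account = f"'{user}'@'{host}'"
    return [
        # 1. Set the new password while keeping the old one valid.
        f"ALTER USER {account} IDENTIFIED BY '<new-password>' "
        "RETAIN CURRENT PASSWORD;",
        # 2. Out-of-band step: deploy the new credential everywhere.
        "-- roll the new password out to every application instance",
        # 3. Once nothing uses the old password, drop it.
        f"ALTER USER {account} DISCARD OLD PASSWORD;",
    ]


for statement in rotation_steps("app"):
    print(statement)
```

Because both passwords are accepted between steps 1 and 3, applications can pick up the new credential gradually instead of in a single coordinated cutover.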
Kubernetes and running databases
- Question raised whether containerizing the DB layer (e.g., on Kubernetes) would have simplified Uber’s upgrade.
- Multiple replies argue “no”:
- Most upgrade complexity is in app logic, query behavior, regressions, and config changes — unrelated to container orchestration.
- Running large stateful DBs on k8s is described as messy; k8s itself struggles at very large node counts, and some components (e.g., CNI plugins) don’t scale well past a few thousand nodes.
Cloud vs self‑managed and migration stories
- Some describe similar 5.7 → 8.0 upgrades on managed services (e.g., Aurora), often using cross‑version replication and blue‑green cutovers with good results.
- Others note:
- Managed MySQL still requires version upgrades; you just outsource some mechanics.
- For truly “never upgrade MySQL yourself,” you’d need a different class of service (e.g., fully managed horizontally scalable DBs), not just hosted MySQL.
LLM‑like writing style debate
- Large subthread on whether the Uber blog post was written or “sanitized” by an LLM:
- Indicators cited: over‑formal tone, heavy adjectives, words like “delve,” “compelling,” “seamless,” “embark,” repeated structure, and list‑like sections.
- Others argue this is just typical corporate/marketing or non‑native (e.g., Indian) English, not necessarily AI.
- Some worry that people are becoming overconfident in spotting AI, leading to false accusations and pressure to oversimplify writing.
Driver safety and business priorities (off‑topic tangent)
- A side discussion criticizes Uber for focusing on infra upgrades while users report dangerous drivers.
- Counterpoints:
- Company behavior is framed as optimizing shareholder value, with legal tools like arbitration reducing liability.
- Others argue safety is (or should be) a core part of long‑term shareholder value regardless.