Migrating Uber's ledger data from DynamoDB to LedgerStore
Scale and Data Model
- Commenters unpack “1 trillion records” as ledger records, not user-visible trips; a single ride/order can generate many entries: fares, fees, tips, taxes, refunds, subscriptions, driver payouts, disputes, etc.
- Trillions of index entries are also mentioned; some commenters infer heavy denormalization and multi‑party accounting (rider, driver, restaurant, tax authorities).
- Debate over whether Uber’s total trip count and per-transaction record count make the trillion figure plausible; consensus in the thread leans toward “yes, plausible.”
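The fan-out argument can be made concrete with a small sketch. Everything below is illustrative: the `LedgerEntry` type, the account names, and the trip count passed to `implied_entries_per_trip` are hypothetical placeholders, not figures or schemas from Uber or the thread. The sketch only shows how one trip can produce several balanced entries, and how to back out the per-trip record count that a trillion-record total would imply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    trip_id: str
    account: str       # hypothetical account names: "rider", "driver", ...
    kind: str          # e.g. "charge", "fare", "tip", "fee", "tax"
    amount_cents: int  # positive = credit, negative = debit


def entries_for_trip(trip_id, fare_cents, fee_cents, tip_cents, tax_cents):
    """Fan one trip out into balanced multi-party ledger entries."""
    total = fare_cents + fee_cents + tip_cents + tax_cents
    return [
        LedgerEntry(trip_id, "rider", "charge", -total),
        LedgerEntry(trip_id, "driver", "fare", fare_cents),
        LedgerEntry(trip_id, "driver", "tip", tip_cents),
        LedgerEntry(trip_id, "platform", "fee", fee_cents),
        LedgerEntry(trip_id, "tax_authority", "tax", tax_cents),
    ]


def implied_entries_per_trip(total_records, total_trips):
    """Average ledger records per trip needed to reach total_records."""
    return total_records / total_trips


entries = entries_for_trip("trip-1", 1000, 250, 200, 80)
# The entries balance: debits and credits sum to zero.
balance = sum(e.amount_cents for e in entries)

# Placeholder trip count, chosen only to illustrate the arithmetic.
ratio = implied_entries_per_trip(1_000_000_000_000, 50_000_000_000)
```

Even this simple trip yields five entries before any refunds, disputes, or payouts are recorded, which is the intuition behind the "plausible" verdict.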
Cost Savings and ROI Debate
- Headline $6M/year savings draws skepticism: some view it as small relative to Uber’s scale; others argue recurring savings are high-value and can justify large one‑time investments.
- Several try to estimate headcount and compensation; rough math suggests a non-trivial portion of the savings could be consumed by development and ongoing maintenance.
- Opportunity cost is raised: could those engineers have generated more value elsewhere vs. cost-saving infra work?
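The rough math commenters attempt can be sketched as a payback calculation. Only the $6M/year gross savings comes from the discussion; the headcount, per-engineer cost, and one-time build cost below are hypothetical inputs picked to show the shape of the calculation, not estimates of Uber's actual spending.

```python
def net_annual_savings(gross_savings, maintenance_engineers, cost_per_engineer):
    """Gross savings minus the yearly cost of engineers maintaining the system."""
    return gross_savings - maintenance_engineers * cost_per_engineer


def payback_years(one_time_cost, net_annual):
    """Years to recover the one-time build cost; None if it is never recovered."""
    if net_annual <= 0:
        return None
    return one_time_cost / net_annual


GROSS_SAVINGS = 6_000_000  # headline figure from the discussion

# Hypothetical assumptions: 4 maintainers at $500k fully loaded each,
# and an $8M one-time development and migration cost.
net = net_annual_savings(GROSS_SAVINGS, 4, 500_000)
years = payback_years(8_000_000, net)
```

Under these made-up inputs, maintenance consumes a third of the savings and the build cost takes two years to recover. Changing the assumptions moves the answer substantially, which is why the thread does not converge.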
Build-vs-Buy and Cloud Dependence
- Many note DynamoDB’s high cost, even when used “correctly” as a key-value store. Some see this migration as evidence that proprietary cloud databases get very expensive at scale.
- Others emphasize the advantages of offloading operations to AWS and question the wisdom of taking on on-call duty, firmware, and hardware concerns for a custom database.
- Lock‑in vs. migration cost is debated: some value moving off AWS primitives; others point out that any large-scale migration (even between VMs) is extremely expensive and risky.
Technical Architecture and Alternatives
- Commenters suggest several alternatives: DynamoDB plus Redshift or another data warehouse tier; Parquet files on S3; hot/cold architectures; MySQL, Postgres, or Spanner-like systems; and ledger-oriented databases such as TigerBeetle and QLDB.
- A long subthread rejects “just use SQLite on a huge box” due to file size limits, single-writer constraints, replication/backup complexity, and availability concerns.
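As a minimal sketch of the hot/cold idea raised above: recent entries stay in a low-latency store while older ones move to cheaper storage. The 90-day window and the tier names are assumptions made for illustration; the thread does not specify a cutoff, and this is not a description of LedgerStore's design.

```python
from datetime import date


def choose_tier(entry_date, today, hot_window_days=90):
    """Route an entry to the 'hot' or 'cold' tier based on its age in days."""
    age_days = (today - entry_date).days
    return "hot" if age_days <= hot_window_days else "cold"


today = date(2024, 1, 31)
recent = choose_tier(date(2024, 1, 1), today)  # 30 days old
old = choose_tier(date(2020, 1, 1), today)     # several years old
```

A real system would also need to keep indexes consistent across tiers and handle late-arriving corrections to old entries, which is where much of the complexity discussed in the thread comes from.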
Data Retention and Compliance
- Some question why so much historical payment data is kept online; replies cite regulatory retention periods (often ~10 years), financial and audit requirements, and fear of deletion bugs in systems that handle money.
- Soft-delete / active–inactive flags are described as common; actual deletion is rare.
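The soft-delete pattern described above can be sketched in a few lines. The `PaymentRecord` and `RecordStore` names are hypothetical and the store is an in-memory stand-in; the point is only that "deleting" flips a flag, so the record remains available for audits.

```python
from dataclasses import dataclass


@dataclass
class PaymentRecord:
    record_id: str
    amount_cents: int
    active: bool = True  # soft-delete flag; False means logically deleted


class RecordStore:
    """In-memory stand-in for a datastore that never physically deletes."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.record_id] = record

    def soft_delete(self, record_id):
        # Mark the record inactive instead of removing it.
        self._records[record_id].active = False

    def active_records(self):
        # Normal application queries only see active records.
        return [r for r in self._records.values() if r.active]

    def all_records(self):
        # Audit queries can still see everything.
        return list(self._records.values())


store = RecordStore()
store.add(PaymentRecord("p1", 1530))
store.add(PaymentRecord("p2", 990))
store.soft_delete("p1")
```

After the soft delete, `store.active_records()` returns only `p2`, while `store.all_records()` still returns both records.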
Startup Spin-Off: HaystackDB
- A founder of a write-optimized datastore joins the discussion, seeking customers.
- Feedback: need enterprise sales, clearer positioning, technical whitepapers, benchmarks, and more convincing pricing (reads seen as too expensive).
- Several urge focusing on a narrow, must-have niche and possibly open-sourcing some components to build trust.