2025-03-27

It's five grand a day to miss our S3 exit

Scale and Hardware Footprint

Commenters note how much capacity now fits in a few racks: Basecamp reportedly runs on ~8 racks across two DCs; some argue current and next‑gen CPUs could shrink that to ~1 rack per DC if density is prioritized.
There’s curiosity (and some concern) about power and cooling requirements at such high CPU and SSD density.

Cloud Costs, Overengineering, and Architecture

Many share stories of cloud bills that rival or exceed large chunks of payroll (e.g., managed Postgres at $60k/month, RDBMS bills >$1M/month, AWS spend ≈50% of dev salaries).
Common causes cited: microservices sprawl (thousands of services), a “just spin up more nodes” mentality, and deeply abstracted ORM/OOP misuse leading to pathological query patterns.
Several argue that simple setups (single VPS, modest hardware) often scale surprisingly far (tens of thousands of daily users) compared to highly distributed, microservice-heavy systems.

Economics: Cloud vs Colo/Managed Hosting

Multiple people say that at moderate to large scale, cloud is “many multiples” more expensive than self‑hosting or rented servers, especially for bandwidth and storage.
One early spreadsheet analysis found ~3‑year break‑even for AWS vs self‑host; others think AWS prices are tuned to that horizon and to customers’ reluctance to plan beyond it.
Disagreement over cost models: one commenter’s LLM‑assisted estimate gives 8–18 years to break even on S3 repatriation; others argue that the estimate’s facility and power numbers are wildly inflated for 1–2 racks.
Cloud’s advantages highlighted: CAPEX avoidance, ability to scale (and especially to scale down to zero), global footprint, rich APIs, and managed services such as RDS and call‑center tooling.
Critics counter that many teams use clouds like overpriced VPS providers, not exploiting autoscaling, multi‑AZ, or advanced services, so they overpay without getting the benefits.
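
The break‑even disputes above come down to simple arithmetic: the upfront cost divided by the monthly savings. The sketch below is a minimal illustration of that calculation, not a model anyone in the thread used. The function name and every dollar figure are hypothetical, chosen only so the example lands on the ~3‑year horizon one commenter mentioned.

```python
from typing import Optional


def break_even_months(
    upfront_cost: float,
    monthly_self_host: float,
    monthly_cloud: float,
) -> Optional[float]:
    """Months until self-hosting's upfront spend is recovered by lower running costs.

    Returns None when self-hosting is not cheaper per month, since the
    upfront cost is then never recovered.
    """
    monthly_savings = monthly_cloud - monthly_self_host
    if monthly_savings <= 0:
        return None
    return upfront_cost / monthly_savings


# Hypothetical inputs (not from the thread): $360k of hardware and setup,
# $5k/month for colo, power, and staff time, versus a $15k/month cloud bill.
print(break_even_months(360_000, 5_000, 15_000))  # 36.0 months, i.e. 3 years
```

The result is very sensitive to the monthly self‑hosting figure, which is why inflated facility and power assumptions can stretch a 3‑year estimate into decades.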

Labor, Skills, and Operational Complexity

Several note that cloud doesn’t eliminate ops work; it just shifts it to IAM, Terraform/Ansible/CDK stacks, debugging service integrations, and cost tuning.
Colocation/managed hosting is described as far smoother than a decade ago: remote hands, IPMI, PXE, and standardized automation narrow the gap with cloud.
A recurring theme is lack of on‑prem experience among younger engineers and management’s assumption that “cloud must be cheaper because it exists.”

Reliability and Redundancy

Some fear colo is less resilient; others respond that with RAID, standby DB nodes, redundant servers, and multiple ISPs, on‑prem setups can match or exceed practical cloud reliability.
Cloud reliability is also questioned: transient storage and networking issues, and widespread outages when a major region hiccups.

Storage Design and S3 Alternatives

Several stress that S3’s durability and features are not like‑for‑like with a single storage array; the move makes sense only if those extra guarantees aren’t needed.
Debate over SSD vs HDD: SSDs give performance and power benefits, but HDDs plus redundancy may be cost‑effective at this scale.
Some wonder whether intelligent tiering and cheaper S3 classes were fully exploited before deciding to exit.

Migration Tooling and Data Transfer

People ask what tool is used to evacuate S3; rclone is cited as successfully moving ~2 PB between DCs, with large files transferring efficiently over 10 Gbit/s.
AWS Snowball and egress‑waiver programs are mentioned; there’s irritation that large discounts often appear only when a customer threatens to leave.
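
For a sense of scale, the lower bound on transfer time for a move like the ~2 PB one mentioned above can be estimated from data size and link speed. This is a rough sketch under stated assumptions: decimal petabytes, a single link, and a hypothetical `efficiency` factor for protocol overhead and idle time. The thread does not say how long the actual transfer took.

```python
def transfer_days(
    data_bytes: float,
    link_bits_per_second: float,
    efficiency: float = 1.0,
) -> float:
    """Estimate days needed to move data_bytes over a link.

    efficiency is the assumed fraction of line rate actually achieved
    (1.0 means a perfectly saturated link, which is optimistic).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


two_petabytes = 2e15  # assumes decimal PB (10**15 bytes)
ten_gbit = 10e9       # 10 Gbit/s

print(round(transfer_days(two_petabytes, ten_gbit), 1))       # 18.5 at full line rate
print(round(transfer_days(two_petabytes, ten_gbit, 0.7), 1))  # 26.5 at an assumed 70%
```

Even at full line rate the move takes weeks on one 10 Gbit/s link, so large migrations tend to rely on parallel links or on offline options such as Snowball.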
