It's five grand a day to miss our S3 exit
Scale and Hardware Footprint
- Commenters note how much capacity now fits in a few racks: Basecamp reportedly runs on ~8 racks across two DCs; some argue current and next‑gen CPUs could shrink that to ~1 rack per DC if density is prioritized.
- There’s curiosity (and some concern) about power and cooling requirements at such high CPU and SSD density.
Cloud Costs, Overengineering, and Architecture
- Many stories of cloud bills rivaling or exceeding large chunks of payroll (e.g., managed Postgres at $60k/month, RDBMS bills >$1M/month, AWS spend ≈50% of dev salaries).
- Common causes cited: microservices sprawl (thousands of services), a “just spin up more nodes” mentality, and misuse of deeply abstracted ORM/OOP layers that produces pathological query patterns.
- Several argue that simple setups (single VPS, modest hardware) often scale surprisingly far (tens of thousands of daily users) compared to highly distributed, microservice-heavy systems.
Economics: Cloud vs Colo/Managed Hosting
- Multiple people say that at moderate to large scale, cloud is “many multiples” more expensive than self‑hosting or rented servers, especially for bandwidth and storage.
- One early spreadsheet analysis found a ~3‑year break‑even for self‑hosting vs AWS; others think AWS prices are tuned to that horizon and to customers’ reluctance to plan beyond it.
- Disagreement over cost models: one commenter’s LLM‑assisted estimate gives 8–18 years to break even on S3 repatriation; others argue that estimate’s facility and power numbers are wildly inflated for 1–2 racks.
- Cloud’s advantages highlighted: CAPEX avoidance, ability to scale (and especially to scale down to zero), global footprint, rich APIs, and managed services such as RDS and call‑center tooling.
- Critics counter that many teams use clouds like overpriced VPS providers, not exploiting autoscaling, multi‑AZ, or advanced services, so they overpay without getting the benefits.
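The break-even disagreement above comes down to simple arithmetic, and a rough sketch shows how sensitive the answer is to the running-cost assumption. Every input below is hypothetical: the $1.5M hardware figure and both monthly self-hosting costs are made up for illustration, and treating the thread title's "five grand a day" as the monthly cloud bill is also an assumption. The function name `break_even_months` is invented here.

```python
import math


def break_even_months(upfront_cost, monthly_self_host, monthly_cloud):
    """Months until the upfront spend is repaid by monthly savings.

    Returns math.inf when self-hosting is not cheaper per month,
    since the upfront cost is then never recovered.
    """
    monthly_saving = monthly_cloud - monthly_self_host
    if monthly_saving <= 0:
        return math.inf
    return upfront_cost / monthly_saving


# Hypothetical inputs, not figures from the thread.
cloud_per_month = 5_000 * 30   # assumes "five grand a day" is the cloud bill
hardware = 1_500_000           # assumed one-off hardware purchase

# Optimistic vs pessimistic facility, power, and staffing costs (assumed).
optimistic = break_even_months(hardware, 20_000, cloud_per_month)
pessimistic = break_even_months(hardware, 135_000, cloud_per_month)

print(f"optimistic:  {optimistic:.1f} months")
print(f"pessimistic: {pessimistic / 12:.1f} years")
```

With these made-up numbers the same hardware purchase pays back in under a year or in more than eight, depending only on the assumed monthly running cost, which is why commenters argue over the facility and power inputs rather than the formula.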
Labor, Skills, and Operational Complexity
- Several note that cloud doesn’t eliminate ops work; it just shifts it to IAM, Terraform/Ansible/CDK stacks, debugging service integrations, and cost tuning.
- Colocation/managed hosting is described as far smoother than a decade ago: remote hands, IPMI, PXE, and standardized automation narrow the gap with cloud.
- A recurring theme is lack of on‑prem experience among younger engineers and management’s assumption that “cloud must be cheaper because it exists.”
Reliability and Redundancy
- Some fear colo is less resilient; others respond that with RAID, standby DB nodes, redundant servers, and multiple ISPs, on‑prem setups can match or exceed practical cloud reliability.
- Cloud reliability is also questioned: transient storage and networking issues, and widespread outages when a major region hiccups.
Storage Design and S3 Alternatives
- Several stress that S3’s durability and features are not like‑for‑like with a single storage array; the move makes sense only if those extra guarantees aren’t needed.
- Debate over SSD vs HDD: SSDs give performance and power benefits, but HDDs plus redundancy may be cost‑effective at this scale.
- Some wonder whether intelligent tiering and cheaper S3 classes were fully exploited before deciding to exit.
Migration Tooling and Data Transfer
- People ask what tool is used to evacuate S3; rclone is cited as successfully moving ~2 PB between DCs, with large files transferring efficiently over 10 Gbit/s.
- AWS Snowball and egress‑waiver programs are mentioned; there’s irritation that large discounts often appear only when a customer threatens to leave.
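For a sense of scale, the ~2 PB and 10 Gbit/s figures mentioned above imply a transfer time that can be estimated directly. This is a back-of-the-envelope sketch, not a description of how anyone in the thread measured it: it assumes decimal petabytes (10^15 bytes), a single fully dedicated link, and a hypothetical `efficiency` factor standing in for protocol overhead and small-file slowdowns.

```python
def transfer_days(size_bytes, link_bits_per_s, efficiency=1.0):
    """Estimate days needed to move size_bytes over one network link.

    efficiency is the assumed fraction of line rate actually achieved
    (1.0 means the link is saturated for the whole transfer).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400  # seconds per day


two_petabytes = 2 * 10**15   # decimal PB, an assumption
ten_gbit = 10 * 10**9        # 10 Gbit/s in bits per second

print(f"at line rate: {transfer_days(two_petabytes, ten_gbit):.1f} days")
print(f"at 80%:       {transfer_days(two_petabytes, ten_gbit, 0.8):.1f} days")
```

Even at full line rate the move takes roughly two and a half weeks, so sustained throughput matters far more than tool choice once files are large enough to keep the link busy.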