AWS outage shows internet users 'at mercy' of too few providers, experts say

Scale and Centralization of AWS

  • Commenters highlight how much traffic runs through AWS (and CloudFront/Cloudflare), arguing this concentrates systemic risk in a few “sheds in Virginia.”
  • Some see this as basic economics: low distribution cost → power-law winners (AWS/Azure/GCP).
  • Others note that many non-cloud options still exist (colo, bare metal, VPS), and centralization is as much lock‑in and marketing as pure technical merit.

Nature of the Outage (us-east-1)

  • Many stress it was not a total regional blackout: existing EC2/Fargate workloads mostly kept running; control planes and some “global” services failed.
  • IAM, STS, Lambda, SQS, DynamoDB, EC2 launches, and CloudWatch visibility were common pain points.
  • Several teams discovered hidden dependencies on us-east-1 endpoints (e.g., IAM), even for workloads in other regions.
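
One way to surface the hidden dependencies described above is to scan configured AWS endpoints for hostnames that do not pin a region. The sketch below is a rough heuristic, not an AWS tool. The helper names are hypothetical, and it assumes the standard `<service>.<region>.amazonaws.com` hostname convention. Global-style hostnames such as `sts.amazonaws.com` or `iam.amazonaws.com` have historically been tied to us-east-1, so they are worth flagging for review.

```python
import re
from urllib.parse import urlparse

# Assumption: region labels look like "eu-west-1" or "us-gov-west-1".
REGION_LABEL = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def is_region_scoped(endpoint: str) -> bool:
    """Return True if the endpoint hostname names an explicit region.

    Heuristic only: handles hostnames ending in "amazonaws.com" and
    treats everything else (including other partitions) as not scoped.
    """
    host = urlparse(endpoint).hostname or endpoint
    labels = host.split(".")
    if len(labels) < 4 or labels[-2:] != ["amazonaws", "com"]:
        return False
    # The label just before "amazonaws" must look like a region name.
    return bool(REGION_LABEL.match(labels[-3]))


def regional_sts_endpoint(region: str) -> str:
    """Build the regional STS endpoint URL for a region such as 'eu-west-1'."""
    return f"https://sts.{region}.amazonaws.com"


def find_global_endpoints(endpoints: list[str]) -> list[str]:
    """Return the configured endpoints that do not pin a region."""
    return [e for e in endpoints if not is_region_scoped(e)]
```

Running `find_global_endpoints` over a service's endpoint configuration gives a short list to review. Whether a flagged endpoint is a real single-region dependency still has to be checked against the provider's documentation.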

Lock-In, Data Gravity, and Cost

  • Large datasets (terabytes to hundreds of terabytes in S3) are cited as the main practical lock-in, not compute.
  • Cross-region or multi-cloud replication is considered prohibitively expensive for many, mainly because of duplicated storage and egress fees.
  • Some mention that competitors or AWS will sometimes eat egress fees for migrations, but ongoing duplication cost and complexity remain.
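
The cost argument above can be made concrete with back-of-the-envelope arithmetic. The sketch below is only illustrative. The function name, parameters, and cost model are assumptions, and the rates in the usage note are placeholders, not any provider's real prices. It separates the one-time transfer of the full dataset from the recurring cost of storing a second copy and shipping ongoing changes.

```python
from dataclasses import dataclass


@dataclass
class ReplicationCost:
    one_time_transfer: float  # moving the full dataset once
    monthly_storage: float    # keeping the second copy
    monthly_transfer: float   # shipping changed data each month

    @property
    def first_year(self) -> float:
        """One-time transfer plus twelve months of recurring cost."""
        return self.one_time_transfer + 12 * (
            self.monthly_storage + self.monthly_transfer
        )


def estimate_replication_cost(
    dataset_tb: float,
    storage_per_gb_month: float,  # placeholder rate, supply your own
    transfer_per_gb: float,       # placeholder rate, supply your own
    monthly_churn: float,         # fraction of the dataset rewritten per month
) -> ReplicationCost:
    """Rough cost of maintaining one extra replica (simplified model)."""
    gb = dataset_tb * 1000  # decimal terabytes to gigabytes
    return ReplicationCost(
        one_time_transfer=gb * transfer_per_gb,
        monthly_storage=gb * storage_per_gb_month,
        monthly_transfer=gb * monthly_churn * transfer_per_gb,
    )
```

For example, `estimate_replication_cost(100, 0.02, 0.05, 0.10)` models a 100 TB dataset with made-up rates. Even if the one-time transfer is waived, the recurring storage and transfer terms remain, which is the point commenters make about ongoing duplication cost.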

Multi-Cloud / Multi-Region Resilience

  • Broad agreement that true multi-cloud resilience is rare because of cognitive overhead, provider differences, orchestration pain, and data consistency issues.
  • Cross-region designs are also hard because of stateful systems, eventual consistency, and the need to replay or merge writes after failover.
  • Many companies consciously accept rare regional outages as a business tradeoff; others argue they misjudge risk and never properly test failover.
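
To illustrate why merging writes after failover is hard, here is a minimal sketch of one common strategy, last-writer-wins. It is not something the thread prescribes. The `Write` type and function name are hypothetical, and it assumes timestamps are comparable across regions, which real clocks do not guarantee.

```python
from typing import Iterable, NamedTuple


class Write(NamedTuple):
    key: str
    value: str
    timestamp: float  # assumption: clocks are comparable across regions
    region: str


def merge_last_writer_wins(
    primary: Iterable[Write], secondary: Iterable[Write]
) -> dict[str, Write]:
    """Merge two write logs, keeping the newest write per key.

    Ties on timestamp are broken by region name so the result is
    deterministic regardless of input order.
    """
    merged: dict[str, Write] = {}
    for w in [*primary, *secondary]:
        current = merged.get(w.key)
        if current is None or (w.timestamp, w.region) > (
            current.timestamp,
            current.region,
        ):
            merged[w.key] = w
    return merged
```

Note that the losing write for each key is silently discarded. That is acceptable for some data (a cached profile field) and not for other data (an account balance), which is why failover plans need testing per dataset rather than a blanket rule.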

Containers and Cloud Lock-In

  • One view: Docker normalized “just ship a container and let the cloud handle storage/infra,” encouraging deeper reliance on proprietary services.
  • Counterview: containers are orthogonal to storage, reduce host-management toil, and actually make it easier to move between clouds or on-prem.

Alternatives and On-Prem

  • Some advocate VPS/local providers or colo to reduce correlated failures and costs, but acknowledge higher operational burden.
  • Others report that their on-prem/colo setups often had more frequent and longer outages due to limited in-house expertise and slower incident response.

Policy, “Experts,” and Systemic Risk

  • Several criticize media “experts” as non-technical policy or legal figures; others defend their role in assessing geopolitical/systemic dependence on foreign hyperscalers.
  • A recurring theme: AWS is likely still more reliable than most alternatives; the real issue is how customers architect and test their systems.