Cloudflare Global Network experiencing issues
Outage scope and symptoms
- Users worldwide report widespread 500/5xx errors from multiple Cloudflare POPs (London, Manchester, Warsaw, Sydney, Singapore, US, etc.), often with Cloudflare’s own error page explicitly blaming itself.
- Behavior is flappy: services go up/down repeatedly over ~30–60 minutes; different regions and products (proxy, DNS, Turnstile, WARP, dashboard) are affected unevenly.
- Many major sites and SaaS tools are down or degraded: X/Twitter, ChatGPT, Claude, Supabase, npmjs, uptime monitors, down-checker sites, some government and transport sites, and status pages themselves.
- Failures in Cloudflare challenges/Turnstile block access and logins even on sites not otherwise proxied by Cloudflare, including Cloudflare’s own dashboard.
Speculation on root cause
- Users speculate about:
  - A control plane or routing/BGP issue propagating bad config globally.
  - A DNS or network-layer failure (the “Cloudflare Global Network” component shows as offline).
  - A possible link to scheduled maintenance.
  - A large DDoS (especially in light of recent Azure/AWS issues), though several point out there is no evidence yet; others expect a postmortem to clarify.
  - WARP/Access-specific messages on the status page, with some wondering whether internal routing or VPN-related changes backfired.
Status pages and communication
- The status page lagged the incident by tens of minutes; it initially showed all green except minor items and scheduled maintenance, prompting criticism that status pages are “marketing” and legally constrained.
- Others argue fully automated, accurate status pages at this scale are effectively impossible; a human always has to interpret noisy signals.
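A minimal, self-hosted probe illustrates the point: rather than waiting on the provider’s status page, a team can hit its own site through the CDN, its origin directly, and an unrelated control site. The endpoint URLs below are hypothetical placeholders, and the interpretation rule (edge 5xx plus a healthy origin suggests a provider-side problem) is an assumption for illustration, not anything from the thread or from Cloudflare.

```python
import urllib.request
import urllib.error

# Hypothetical endpoints: your site through the CDN, your origin directly,
# and an unrelated third-party site as a control.
PROBES = {
    "via-cdn": "https://www.example.com/healthz",
    "origin-direct": "https://origin.example.com/healthz",
    "control": "https://example.org/",
}

def probe(url: str, timeout: float = 5.0) -> str:
    """Return a coarse status string for one endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"ok ({resp.status})"
    except urllib.error.HTTPError as e:
        # 5xx from the edge while the direct origin is healthy points upstream.
        return f"http-error ({e.code})"
    except (urllib.error.URLError, TimeoutError) as e:
        return f"unreachable ({e})"

if __name__ == "__main__":
    for name, url in PROBES.items():
        print(f"{name:15s} {probe(url)}")
```

If “via-cdn” reports 5xx while “origin-direct” and “control” look healthy, the failure is probably upstream of your own deployment, which is exactly the signal people were reconstructing from HN comments while the official page stayed green.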
Developer experience and “phewphoria”
- Many initially blamed their own deployments, restarted servers, or feared misconfigurations before discovering it was Cloudflare.
- Commenters coin or refine a name for the relief felt when an outage isn’t your fault (“phewphoria”), though some prefer problems they caused themselves because they can at least fix those.
- Management pressure and SLA expectations resurface; teams use global outages as leverage to justify redundancy work or to calm executives.
Centralization, risk, and tradeoffs
- Strong concern that Cloudflare (plus AWS/Azure) has become a systemic single point of failure; outages now feel like “turning off the internet.”
- Counterpoint: many small and medium sites need Cloudflare-like DDoS protection and bot filtering (especially against AI scrapers), and are still better off tolerating occasional global Cloudflare outages than maintaining constant bespoke defenses.
- Debate over:
  - Using Cloudflare as registrar, DNS, and CDN all at once (hard to escape during outages).
  - Having fallbacks: alternative CDNs (e.g., Bunny), on-prem or VPS setups, multi-CDN/multi-cloud, separate status-page hosting (see the sketch after this list).
  - Whether most sites actually need Cloudflare versus simpler hosting, caching, and local WAFs.
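To make the fallback idea concrete, here is a minimal sketch assuming the same content is published behind two CDN hostnames (both names are hypothetical); it retries against the secondary only on 5xx or network-level failures. Server-side, the more common equivalent is DNS failover with low TTLs, which still breaks if the registrar and DNS live on the same failing provider, as the first bullet above notes.

```python
import urllib.request
import urllib.error

# Hypothetical hostnames: the same origin published behind two different CDNs.
PRIMARY = "https://cdn-primary.example.com"
SECONDARY = "https://cdn-secondary.example.com"

def fetch_with_fallback(path: str, timeout: float = 5.0) -> bytes:
    """Try the primary CDN first; fall back to the secondary on 5xx or network errors."""
    for base in (PRIMARY, SECONDARY):
        url = base + path
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code < 500:
                raise  # 4xx is a real answer, not an outage; don't mask it.
            # 5xx: treat as an edge problem and try the next CDN.
        except (urllib.error.URLError, TimeoutError):
            pass  # Network-level failure: try the next CDN.
    raise RuntimeError("all CDNs failed for " + path)

if __name__ == "__main__":
    print(len(fetch_with_fallback("/")))
```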
Broader lessons
- Outage reinforces:
  - The fragility created by centralizing so much traffic and security behind one provider.
  - The difficulty of avoiding single points of failure in practice, even for “multi-cloud” setups that still bottleneck through Cloudflare.
  - The informal role of HN as a de facto, independent “status page” for major internet incidents.