Cloudflare Global Network experiencing issues

Outage scope and symptoms

  • Users worldwide report widespread 500/5xx errors from multiple Cloudflare POPs (London, Manchester, Warsaw, Sydney, Singapore, US, etc.), often with Cloudflare’s own error page explicitly blaming itself (see the probe sketch after this list).
  • Behavior is flappy: services go up/down repeatedly over ~30–60 minutes; different regions and products (proxy, DNS, Turnstile, WARP, dashboard) are affected unevenly.
  • Many major sites and SaaS tools are down or degraded: X/Twitter, ChatGPT, Claude, Supabase, npmjs, uptime monitors, down-checker sites, some government and transport sites, and status pages themselves.
  • Cloudflare challenges/Turnstile failures block access and logins even to sites not otherwise proxied by Cloudflare, including Cloudflare’s own dashboard.
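
A quick way to tell whether a 5xx is coming from Cloudflare’s edge or from your own origin is to inspect the response itself. The sketch below is a minimal illustration, not a definitive check: it relies on the usual markers of a Cloudflare-proxied response (a cf-ray header, Server: cloudflare, and Cloudflare branding in the error body), and example.com stands in for your own hostname.

```python
import requests

def classify_5xx(url: str, timeout: float = 10.0) -> str:
    """Guess whether a 5xx came from the Cloudflare edge or the origin."""
    try:
        resp = requests.get(url, timeout=timeout)
    except requests.RequestException as exc:
        return f"no HTTP response at all: {exc}"

    if resp.status_code < 500:
        return f"no 5xx (status {resp.status_code})"

    # Heuristics only: Cloudflare-proxied responses normally carry a
    # cf-ray header and "Server: cloudflare"; Cloudflare's own error
    # pages mention Cloudflare in the body.
    server = resp.headers.get("Server", "").lower()
    has_cf_ray = "cf-ray" in resp.headers  # header lookup is case-insensitive
    body_mentions_cf = "cloudflare" in resp.text.lower()

    if has_cf_ray and (server == "cloudflare" or body_mentions_cf):
        return f"{resp.status_code} served by the Cloudflare edge (likely provider-side)"
    return f"{resp.status_code} looks like it came from the origin (likely your problem)"

if __name__ == "__main__":
    print(classify_5xx("https://example.com/"))  # example.com is a placeholder
```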

Speculation on root cause

  • Users speculate about:
    • A control plane or routing/BGP issue propagating bad config globally.
    • A DNS or network-layer failure (the “Cloudflare Global Network” component shows as offline); the layer-check sketch after this list probes that distinction from the outside.
    • Possible link to scheduled maintenance.
    • A large DDoS (especially in light of recent Azure/AWS issues), though several point out there is no evidence yet; others expect a postmortem to clarify.
  • Some note WARP/Access-specific messages on the status page and wonder if internal routing or VPN-related changes backfired.
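
None of this speculation is confirmed, but the DNS-versus-proxy distinction is easy to probe from the outside. The sketch below is a rough diagnostic under that assumption: it asks several public resolvers for the hostname and then tries HTTPS, so a failure at the first step points at the DNS layer and a 5xx at the second step points at the proxy/edge layer. The resolver IPs and example.com are illustrative stand-ins.

```python
import dns.resolver  # pip install dnspython
import requests

RESOLVERS = {"Cloudflare": "1.1.1.1", "Google": "8.8.8.8", "Quad9": "9.9.9.9"}

def check(hostname: str) -> None:
    resolved_anywhere = False
    for name, ip in RESOLVERS.items():
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [ip]
        resolver.lifetime = 5
        try:
            answers = resolver.resolve(hostname, "A")
            resolved_anywhere = True
            print(f"{name} ({ip}): {[a.address for a in answers]}")
        except Exception as exc:
            print(f"{name} ({ip}): DNS lookup failed ({exc})")

    if not resolved_anywhere:
        print("=> resolution fails everywhere: looks like a DNS-layer problem")
        return

    # DNS works, so try the HTTP/proxy layer.
    try:
        resp = requests.get(f"https://{hostname}/", timeout=10)
        if resp.status_code >= 500:
            print(f"=> DNS resolves but the edge returns {resp.status_code}: proxy-layer problem")
        else:
            print(f"=> HTTP {resp.status_code}: both layers answer")
    except requests.RequestException as exc:
        print(f"=> DNS resolves but HTTPS fails: {exc}")

if __name__ == "__main__":
    check("example.com")  # placeholder hostname
```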

Status pages and communication

  • The status page lagged the incident by tens of minutes, initially showing all green except minor items and scheduled maintenance, prompting criticism that status pages are “marketing” and legally constrained.
  • Others argue that fully automated, accurate status pages at this scale are effectively impossible; a human always has to interpret noisy signals (the sketch below shows the shape of that tradeoff).
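
As a rough illustration of why this is hard, the sketch below only treats a problem as a possible incident when a quorum of probes fails for several consecutive rounds, and even then it pages a human rather than flipping a public status page automatically. Endpoints, thresholds, and the paging step are all placeholder assumptions, not anyone’s real monitoring setup.

```python
import time
import requests

# Placeholder probe targets; a real setup would probe from many regions.
ENDPOINTS = [
    "https://example.com/healthz",
    "https://example.org/healthz",
]
QUORUM = 0.5            # fraction of probes that must fail in one round
CONSECUTIVE_ROUNDS = 3  # failing rounds required before anyone is paged

def probe(url: str) -> bool:
    """True if the endpoint answers with something below 500."""
    try:
        return requests.get(url, timeout=5).status_code < 500
    except requests.RequestException:
        return False

def monitor() -> None:
    bad_rounds = 0
    while True:
        results = [probe(url) for url in ENDPOINTS]
        failure_ratio = results.count(False) / len(results)
        bad_rounds = bad_rounds + 1 if failure_ratio >= QUORUM else 0
        if bad_rounds >= CONSECUTIVE_ROUNDS:
            # The automation only pages a human; it does not update the
            # public status page on its own.
            print("possible incident: paging on-call for confirmation")
            bad_rounds = 0
        time.sleep(60)

if __name__ == "__main__":
    monitor()
```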

Developer experience and “phewphoria”

  • Many initially blamed their own deployments, restarted servers, or feared misconfigurations before discovering it was Cloudflare.
  • Commenters coin or refine a term for the relief felt when an outage is not your fault (“phewphoria”), though some prefer problems they caused themselves because at least they can fix those.
  • Management pressure and SLA expectations resurface; teams use global outages as leverage to justify redundancy work or to calm executives.

Centralization, risk, and tradeoffs

  • Strong concern that Cloudflare (plus AWS/Azure) has become a systemic single point of failure; outages now feel like “turning off the internet.”
  • Counterpoint: many small and medium sites need Cloudflare-like DDoS protection and bot filtering (especially against AI scrapers), and are still better off tolerating occasional global Cloudflare outages than mounting constant bespoke defenses.
  • Debate over:
    • Using Cloudflare as registrar, DNS provider, and CDN at once (hard to escape during outages).
    • Having fallbacks: alternative CDNs (e.g., Bunny), on-prem or VPS setups, multi-CDN/multi-cloud, separate status-page hosting (a minimal failover sketch follows this list).
    • Whether most sites actually need Cloudflare versus simpler hosting, caching, and local WAFs.
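
For the fallback side of that debate, the sketch below shows the simplest possible shape of a multi-CDN check: probe a primary (Cloudflare-proxied) hostname and a secondary one for the same site, then report where traffic should point. The hostnames are placeholders, and the actual cutover, typically a low-TTL DNS update or a load-balancer change, is provider-specific and deliberately omitted.

```python
import requests

PRIMARY = "https://www.example.com/healthz"         # e.g. the Cloudflare-proxied hostname
SECONDARY = "https://fallback.example.com/healthz"  # e.g. Bunny or a direct-to-origin path

def healthy(url: str, attempts: int = 3) -> bool:
    """Healthy if any of a few quick probes returns something below 500."""
    for _ in range(attempts):
        try:
            if requests.get(url, timeout=5).status_code < 500:
                return True
        except requests.RequestException:
            pass
    return False

def choose_target() -> str:
    if healthy(PRIMARY):
        return PRIMARY
    if healthy(SECONDARY):
        return SECONDARY
    return "both paths are down: nothing to fail over to"

if __name__ == "__main__":
    # The actual switch (DNS update, load-balancer change) is provider-specific
    # and left out of this sketch.
    print("serve traffic via:", choose_target())
```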

Broader lessons

  • The outage reinforces:
    • The fragility created by centralizing so much traffic and security behind one provider.
    • The difficulty of avoiding single points of failure in practice, even for “multi-cloud” setups that still bottleneck through Cloudflare.
    • The informal role of HN as a de facto, independent “status page” for major internet incidents.