2026-02-02

GitHub experiences various partial outages/degradations

Azure outage and root cause

Multiple comments link GitHub’s partial outage to an ongoing Azure incident affecting VM management operations (create/update/scale/start/stop) across several regions.
Azure’s own status page cites a misconfiguration: a change to the ACLs on storage accounts that host VM extensions broke public access, impacting Azure DevOps, GitHub, and others.
Users report GitHub Actions failing, self‑hosted runners unable to scale, and jobs stuck in queues while minutes continue to be consumed.

GitHub reliability and Azure migration concerns

Several see this as part of a broader pattern: “monthly” or even “daily” GitHub incidents, with January cited as having an incident count roughly equal to the number of days.
Some argue that shifting blame to “our upstream provider” is disingenuous since GitHub and Azure are owned by the same parent company.
There’s frustration that GitHub has become less reliable since deeper Azure integration, and doubts that Microsoft leadership treats GitHub’s reliability as a priority.

Cloud capacity, quotas, and the “infinite” myth

Multiple complaints about Azure VM quotas and capacity: multi‑month waits for small quota increases, needing to migrate regions due to lack of hardware, and repeated VM‑ops issues.
Others note AWS has similar capacity and quota problems, just often less visible; instance types and AZ pools can be exhausted.
Discussion highlights that cloud is not actually infinite; it’s still finite hardware with opaque limits and sometimes slow or denied increases.
One thread explains why organizations still choose cloud: compliance, observability, PaaS (managed AD/Entra, SQL, web hosting), and serverless removing ops burden for small teams.

Multi-region and control plane resilience

Criticism that Azure continues to have faults spanning multiple regions, especially in the VM control plane.
Commenters contrast architectural approaches among hyperscalers and note that all share a vulnerability: a control-plane outage can break scaling and lifecycle operations even if running VMs stay up.
For true resilience, some argue you must pre‑allocate capacity and avoid relying on autoscaling, which makes cloud feel closer to owning hardware.
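The pre-allocation argument can be illustrated with a toy model. This is only a sketch under made-up assumptions: the function name `served`, the demand figures, and the rule that a scale-out succeeds instantly whenever the control plane is up are all hypothetical and not taken from the thread or from any provider's behavior.

```python
def served(demand, start_capacity, control_plane_up, autoscale):
    """Return the load actually served at each step.

    Capacity can only grow while the control plane is up; already
    running capacity keeps serving traffic either way.
    """
    capacity = start_capacity
    out = []
    for load, up in zip(demand, control_plane_up):
        if autoscale and up and load > capacity:
            capacity = load  # scale-out request succeeds
        out.append(min(load, capacity))
    return out


# Hypothetical scenario: demand spikes while the control plane is down.
demand = [10, 10, 40, 40]
control_plane_up = [True, True, False, False]

autoscaled = served(demand, 10, control_plane_up, autoscale=True)
preallocated = served(demand, 40, control_plane_up, autoscale=False)
```

In this scenario the autoscaled fleet stays at its pre-outage size and serves only part of the spike, while the pre-allocated fleet serves all of it, at the cost of paying for idle capacity the rest of the time.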

Alternatives and self-hosting

Suggestions include moving to other forges or at least maintaining a bare mirror to ride out GitHub outages.
GitLab is seen as less appealing after price/plan changes; some praise Codeberg and self‑hosted Forgejo/Gitea as closer to “old GitHub.”
There’s concern about open source projects’ dependence on a single corporate host and what happens if free hosting is reduced or withdrawn.
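The bare-mirror suggestion above could look roughly like the following. It is a minimal sketch, not a method described in the thread: the helper names `mirror_commands` and `sync_mirror` are hypothetical, and it assumes `git` is on the PATH and that the remote URL and local directory are supplied by the caller.

```python
import subprocess
from pathlib import Path


def mirror_commands(remote_url: str, mirror_dir: str) -> list[list[str]]:
    """Return the git commands that create or refresh a bare mirror."""
    if Path(mirror_dir).exists():
        # Existing mirror: fetch all refs and prune ones deleted upstream.
        return [["git", "-C", mirror_dir, "remote", "update", "--prune"]]
    # First run: --mirror implies --bare and copies every ref.
    return [["git", "clone", "--mirror", remote_url, mirror_dir]]


def sync_mirror(remote_url: str, mirror_dir: str) -> None:
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in mirror_commands(remote_url, mirror_dir):
        subprocess.run(cmd, check=True)
```

Running `sync_mirror` on a schedule keeps a local copy that can still be cloned or fetched from when the primary host is unavailable. It covers only the git data, not issues, pull requests, or CI.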

AI, communication, and status handling

Several jokes blame AI (Copilot, agents) for configuration mishaps and outages, and quip that Copilot being down might improve code quality.
Users complain that GitHub’s status page often lags reality; they use Hacker News as a “sanity check” when jobs silently stall.
Some ask whether paid users will be credited for wasted GitHub Actions minutes during these incidents.
