Incident with multiple GitHub services

Perceived decline in GitHub reliability

  • Many commenters say frequent outages now feel like “normal mode” for GitHub.
  • Some joke that alerts should fire when GitHub is up, not when it is down.
  • Several call this level of reliability “embarrassing” for a mature, central developer platform.

Uptime metrics and status-page skepticism

  • A third‑party aggregation of GitHub’s own status data puts combined uptime at roughly 88%, with core git operations at roughly 99% (“one nine”).
  • Commenters note GitHub’s status page now avoids aggregate numbers and instead shows many green sub-services, which can mask the real impact.
  • Breaking services into many components is seen by some as a way to make reliability appear better than it is.
  • Confusion over “green” days that still show multiple incidents leads to accusations of the status page being misleading.
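
The gap between a combined figure and per-component figures comes down to simple arithmetic, which a minimal sketch can illustrate. It assumes "combined uptime" means the share of time in which no component had an incident. The thread does not spell out the aggregator's exact method, so that definition, the function names, and the sample outage windows below are all hypothetical.

```python
# Sketch only: the definition of "combined uptime" and the sample data
# are assumptions for illustration, not GitHub's or the aggregator's method.

HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(uptime_pct):
    """Hours of downtime in a 365-day year at the given uptime percentage."""
    return (100 - uptime_pct) / 100 * HOURS_PER_YEAR


def component_uptime_pct(window_hours, outages):
    """Uptime of one component; its outage windows are assumed not to overlap."""
    down = sum(end - start for start, end in outages)
    return 100 * (window_hours - down) / window_hours


def combined_uptime_pct(window_hours, outages_by_component):
    """Share of the window in which no component at all was down.

    Overlapping outages from different components are merged so that
    the same hour of downtime is never counted twice.
    """
    intervals = sorted(
        iv for ivs in outages_by_component.values() for iv in ivs
    )
    down = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # Gap before this outage: close out the previous merged window.
            if cur_end is not None:
                down += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current merged window: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        down += cur_end - cur_start
    return 100 * (window_hours - down) / window_hours


# Hypothetical outage windows, in hours from the start of a 100-hour window.
sample = {
    "git": [(10, 11)],
    "actions": [(10, 14), (50, 53)],
    "pages": [(52, 56)],
}

if __name__ == "__main__":
    for name, outages in sample.items():
        print(name, component_uptime_pct(100, outages))
    print("combined", combined_uptime_pct(100, sample))
    print("hours down per year at 88%:", downtime_hours_per_year(88))
    print("hours down per year at 99%:", downtime_hours_per_year(99))
```

With this made-up data, every component individually sits above 90% while the combined figure is lower than any single one of them, which is the effect commenters describe when many mostly-green sub-services hide the overall picture. On the yearly scale, 88% uptime works out to about 44 days of downtime and 99% to about 3.65 days.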

Specific incidents and impact

  • In addition to outages and slow/missed Actions, a merge queue regression silently merged malformed commits and effectively reverted multiple PRs on default branches.
  • Several teams received post‑hoc PDFs listing affected commits and remediation steps; commenters view this kind of silent history corruption as far worse than simple downtime.

Alternatives and self-hosting

  • Many report moving or experimenting with GitLab, Gitea, Forgejo, Codeberg, Sourcehut, and others.
  • Self-hosted Forgejo/Gitea on modest hardware is frequently praised for speed, uptime, and cost (especially for CI runners).
  • Some still mirror to GitHub for visibility and recruiting, treating it as a public front for a privately hosted forge.

Homelab practices and trade-offs

  • Detailed homelab setups (Proxmox, NixOS, containers, runners for multiple OSes, backups via Backblaze/Hetzner, etc.) are described.
  • Others argue for radical simplicity (single box, minimal services) and warn against overcomplicated “alphabet soup” stacks.
  • NixOS is highlighted as making long‑lived homelabs easier via declarative, self‑documenting configs and rollbacks.

AI usage, scale, and Azure

  • Some blame instability on Microsoft’s stewardship, layoffs, Azure migration priorities, and “AI slop.”
  • Others point to massive growth in commits and GitHub Actions minutes, driven partly by AI bots and “vibe coders,” which strains the platform at its current scale.
  • Exact causality between Azure, AI usage, and outages is debated and ultimately unclear.

Lock-in, prestige, and business impact

  • GitHub’s network effects and prestige for hiring are seen as strong lock‑in, especially for corporate users.
  • Some believe companies will only move when GitHub loses status, not over reliability alone.
  • Commenters debate whether GitHub is actually losing business; many think the current outages are largely written off as a “cost of doing business.”