How we decreased GitLab repo backup times from 48 hours to 41 minutes

Nature of the bug and the fix

  • Backup slowness was traced to a 15-year-old git bundle create function with an O(N²) duplicate-ref check implemented as a nested loop over refs.
  • The fix replaced this with a set/map-based deduplication, turning the critical section into O(N) (or N·logN depending on implementation), yielding a ~6x improvement locally but much larger end-to-end impact for GitLab (48 hours → 41 minutes).
  • Commenters note this is a textbook “use a hash/set instead of a quadratic scan” situation, and that in C it’s common to default to arrays/loops because container usage is more cumbersome.

Debate over “exponential” language

  • The article’s phrase “reducing backup times exponentially” in the same sentence as big-O notation drew heavy criticism.
  • Many argued that in a technical performance post, “exponential” should not be used colloquially; it created confusion and wasted readers’ time trying to locate an actual exponential-time algorithm.
  • Suggested alternatives: “from quadratic to linear,” “dramatically,” “by a factor of n,” or explicitly stating the new big-O.
  • A few defended colloquial use, but most saw it as sloppy or misleading in this context; the author later acknowledged the issue.

Quadratic complexity in practice

  • Multiple anecdotes support the idea that O(N²) is the “sweet spot of bad algorithms”: fast in tests, disastrous at scale (sometimes far beyond quadratic in the wild).
  • Discussion covers when N is small and bounded (e.g., hardware-limited cases) where quadratic or even cubic can be acceptable, versus unbounded N where it’s a latent production time bomb.
  • Some advocate systematically eliminating N² unless N has a hard, low upper bound, and adding explicit safeguards if assumptions about N might change.

C, data structures, and Git implementations

  • Several comments note this as an example that C alone doesn’t guarantee speed; algorithm and data structure choices dominate.
  • Others argue C makes these mistakes likelier because sets/maps aren’t standard and are harder to integrate than in languages with built-in containers.
  • Thread lists alternative Git implementations/libraries (Rust, Go, Java, Haskell, C libraries), with some using them successfully in production.

Backup strategies and alternatives

  • Many question why GitLab relies on git bundle instead of filesystem-level backups (e.g., ZFS/Btrfs snapshots plus send/receive).
  • Defenses: direct filesystem copying can bypass Git’s integrity checking and is tricky across diverse filesystems and self-hosted environments; bundles provide a portable, Git-aware backup format.
  • Offsite replication of ZFS snapshots and snapshot consistency issues (e.g., with Syncthing) are discussed; Git’s lack of a WAL-like mechanism makes naive snapshotting risky in some setups.

Profiling and tooling

  • The flame graph in the article was produced via perf plus FlameGraph tooling; alternatives recommended include pprof, gperftools, and Samply with Firefox’s profiler.
  • Several commenters emphasize that algorithmic changes uncovered via profiling dwarf typical micro-optimizations.

Reactions to GitLab and the write-up

  • Some praise the upstream contribution (landing in Git v2.50.0) and note GitLab now has a dedicated Git team.
  • Others complain about GitLab’s slow UI and the blog post’s style: too long, possibly LLM-like, and missing concrete code snippets, though still useful once skimmed to the core technical section.