We shrunk our Javascript monorepo git size

Git delta/compression issue & fixes

  • Discussion centers on how Git groups files for delta compression: it uses a hash of only the last 16 characters of the path, rather than the filename or the full path.
  • In large JS monorepos with many similarly named paths (e.g., numerous .../CHANGELOG.*), different files collide into the same hash bucket.
  • This leads Git to compute deltas between unrelated files, blowing up pack size when those files are large and frequently changed.
  • New options in Microsoft's Git fork (--full-name-hash, later superseded by a path-walk API exposed as --path-walk) address this by hashing the full path and grouping objects by path before computing deltas.
  • Commenters emphasize that simply increasing window size is a workaround with huge memory cost; proper path-based grouping is the real fix.
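
The collision described above can be illustrated with a minimal Python sketch. It is a simplified model, not Git's actual hash function: `suffix_key` simply keeps the last 16 characters of a path, and `full_path_key` stands in for full-path grouping. The function names and example paths are hypothetical.

```python
from collections import defaultdict

SUFFIX_LEN = 16  # the "last 16 bytes of the path" mentioned in the thread


def suffix_key(path):
    """Simplified stand-in for Git's name hash: only the tail of the path counts."""
    return path[-SUFFIX_LEN:]


def full_path_key(path):
    """Stand-in for full-path grouping: the whole path is the key."""
    return path


def group_paths(paths, key_fn):
    """Bucket paths by key; each bucket is a set of delta candidates."""
    groups = defaultdict(list)
    for p in paths:
        groups[key_fn(p)].append(p)
    return dict(groups)


# Hypothetical monorepo paths: two unrelated changelogs plus one source file.
paths = [
    "packages/alpha/CHANGELOG.json",
    "packages/beta/CHANGELOG.json",
    "packages/alpha/src/index.ts",
]

by_suffix = group_paths(paths, suffix_key)
by_full = group_paths(paths, full_path_key)

for key, members in by_suffix.items():
    print(repr(key), "->", members)
```

Under this model the two changelogs share one bucket, so they would be tried as delta bases for each other even though they belong to different packages. Keying on the full path keeps each file's history in its own bucket.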

Effects on other repos & local optimization

  • Users report large real-world gains running aggressive git repack with bigger windows and/or path-walk, shrinking multi‑GB repos by more than half.
  • Concern raised that GitHub and other hosts may not run such heavy repacks routinely, so remote clones can remain inflated even if local clones are optimized.
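
A local experiment along these lines might look like the sketch below. It assumes `git` is on the PATH. The window and depth values are illustrative rather than tuned, and `--path-walk` is only passed on request because it needs a Git build that supports it. The helper names are hypothetical.

```python
import subprocess


def build_repack_cmd(window=250, depth=50, path_walk=False):
    """Build a `git repack` command line; the numbers are illustrative only."""
    cmd = ["git", "repack", "-a", "-d", "-f",
           f"--window={window}", f"--depth={depth}"]
    if path_walk:
        # Assumption: the installed Git build understands --path-walk.
        cmd.append("--path-walk")
    return cmd


def parse_size_pack_kib(count_objects_output):
    """Extract the `size-pack` value (KiB) from `git count-objects -v` output."""
    for line in count_objects_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "size-pack":
            return int(value.strip())
    raise ValueError("size-pack not found in output")


def pack_size_kib(repo_dir):
    """Ask Git for the current on-disk pack size of a repository."""
    result = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return parse_size_pack_kib(result.stdout)


def repack_and_report(repo_dir, **options):
    """Repack a clone and return (size_before_kib, size_after_kib)."""
    before = pack_size_kib(repo_dir)
    subprocess.run(build_repack_cmd(**options), cwd=repo_dir, check=True)
    return before, pack_size_kib(repo_dir)
```

Calling `repack_and_report("path/to/clone", window=250)` on a scratch clone gives a before/after comparison. As commenters note, larger windows cost more memory and time, and the result only affects the local clone.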

Git history cleanup & binaries

  • Thread briefly revisits classic advice for large blobs: remove binary history with tools like filter-branch, BFG, or git-filter-repo, and use Git LFS for binaries.
  • Distinction made between single large binaries (less harmful) and small but frequently changing binaries (much worse for repo size).
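
The distinction between one large binary and a small, frequently changed one can be checked with a rough sketch like the following. It assumes the input is the text produced by `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'`. It sums uncompressed blob sizes per path, which is only a rough indicator of pack impact. The function name, sample paths, and object ids are made up.

```python
from collections import defaultdict


def total_bytes_per_path(batch_check_output):
    """Sum the sizes of every historical blob version, grouped by path."""
    totals = defaultdict(int)
    for line in batch_check_output.splitlines():
        # Expected line shape: "<type> <object id> <size> <path>"
        parts = line.split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        _, _object_id, size, path = parts
        totals[path] += int(size)
    # Largest cumulative footprint first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Fabricated sample: one 50 MB binary committed once,
# and a 2 MB binary rewritten in 40 commits.
sample = "\n".join(
    ["blob aaa 50000000 assets/installer.bin"]
    + [f"blob b{i:03d} 2000000 assets/sprites.png" for i in range(40)]
)

ranking = total_bytes_per_path(sample)
for path, total in ranking:
    print(f"{total:>12} {path}")
```

In this made-up sample the small, frequently rewritten file accounts for more bytes across history than the single large one, which matches the distinction drawn in the thread.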

Monorepos vs many repos

  • Some argue this is an avoidable, self‑inflicted monorepo problem (thousands of packages in one repo).
  • Others counter that multiple repos introduce painful cross‑repo versioning and atomic-change issues; tooling complexity is the lesser evil.
  • Examples from large organizations are cited on both sides, with no clear consensus.

Azure DevOps, Teams, and internal tooling

  • Surprise that Azure DevOps is heavily used internally at Microsoft; several detailed complaints about its UX, reliability, security limitations, and slower feature development compared with GitHub.
  • Mixed views on Office web apps and Teams: some praise complex cross‑platform collaboration; others report severe bugs, performance issues, and configuration fragility.

Network, “Europe” remark & cloning

  • The “folks in Europe can’t clone” line sparks debate.
  • Many note European consumer connections are often very fast; problems are more about transatlantic latency, flaky VPN/corporate networks, or packet loss than raw bandwidth.
  • Some personal anecdotes describe huge repos timing out over high-latency or unreliable links.

Article style, wording, and tone

  • Several readers find the article's GIFs and “color” distracting and wish for a clearer technical explanation.
  • Others help reconstruct and clarify the technical details via linked PRs and cover letters.
  • Thread includes side discussions and jokes about the title wording (“shrank/shrunk/shrunken/shrunked”) and other lighthearted quips.