We shrunk our Javascript monorepo git size
Git delta/compression issue & fixes
- Discussion centers on Git grouping files for delta compression using a hash of the last 16 bytes of the path, not just filename.
- In large JS monorepos with many similarly named paths (e.g., numerous .../CHANGELOG.*), different files collide into the same hash bucket.
- This leads Git to compute deltas between unrelated files, blowing up pack size when those files are large and frequently changed.
- New options in a Microsoft Git fork (--full-name-hash, later superseded by a path-walk API / --path-walk) address this by using full paths and better grouping.
- Commenters emphasize that simply increasing window size is a workaround with huge memory cost; proper path-based grouping is the real fix.
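
The collision described above can be sketched in a few lines of Python. This is an approximate port of the classic name hash in Git's pack-objects code as commonly described (each byte is added at the top of a 32-bit value while older bytes shift right by two bits), assuming ASCII paths. The function names and the three paths are hypothetical and chosen only for illustration.

```python
def name_hash(path: str) -> int:
    """Approximate port of Git's classic pack name hash (ASCII paths assumed)."""
    h = 0
    for byte in path.encode("utf-8"):
        if byte in b" \t\n\r\x0b\x0c":  # whitespace bytes are skipped
            continue
        # The newest byte lands in the top 8 bits; older bytes shift right by
        # two bits each step, so bytes roughly 16 or more positions from the
        # end contribute almost nothing to the final value.
        h = ((h >> 2) + (byte << 24)) & 0xFFFFFFFF
    return h


def hash_gap(a: str, b: str) -> int:
    """Distance between two paths in the hash-sorted ordering."""
    return abs(name_hash(a) - name_hash(b))


# Hypothetical paths, for illustration only.
changelog_a = "packages/react-button/CHANGELOG.json"
changelog_b = "packages/react-slider/CHANGELOG.json"
source_file = "packages/react-button/src/index.ts"

print(hash_gap(changelog_a, changelog_b))  # tiny gap: the changelogs sort side by side
print(hash_gap(changelog_a, source_file))  # large gap: sorts far away
```

Because delta candidates are drawn from neighbors in this ordering, changelogs from different packages end up competing with each other for the same window slots, which is the effect a full-path hash or path-walk grouping is meant to avoid.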
Effects on other repos & local optimization
- Users report large real-world gains running aggressive git repack with bigger windows and/or path-walk, shrinking multi‑GB repos by more than half.
- Concern raised that GitHub and other hosts may not run such heavy repacks routinely, so remote clones can remain inflated even if local clones are optimized.
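
A local aggressive repack of the kind described above might be scripted as follows. This is a sketch only: the helper names are hypothetical, the window and depth values are illustrative rather than taken from the thread, and --path-walk is passed only on request because it assumes a Git build that supports it.

```python
import subprocess
from typing import List


def build_repack_command(window: int = 250, depth: int = 50,
                         path_walk: bool = False) -> List[str]:
    """Build a `git repack` invocation that recomputes deltas from scratch."""
    cmd = [
        "git", "repack",
        "-a",  # pack everything reachable into a single pack
        "-d",  # delete the packs made redundant by the new one
        "-f",  # do not reuse existing deltas; recompute them
        f"--window={window}",  # more delta candidates per object (costs memory)
        f"--depth={depth}",    # maximum delta chain length
    ]
    if path_walk:
        # Assumption: only valid on Git builds that implement this option.
        cmd.append("--path-walk")
    return cmd


def repack(repo_dir: str, **options) -> None:
    """Run the repack inside repo_dir; raises if git exits non-zero."""
    subprocess.run(build_repack_command(**options), cwd=repo_dir, check=True)
```

Calling repack("/path/to/clone", window=250) only shrinks that local clone; as the thread notes, the hosted copy stays as large as the host's own maintenance leaves it.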
Git history cleanup & binaries
- Thread briefly revisits classic advice for large blobs: remove binary history with tools like filter-branch, BFG, or git-filter-repo, and use Git LFS for binaries.
- Distinction made between single large binaries (less harmful) and small but frequently changing binaries (much worse for repo size).
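
The single-large versus small-but-churning distinction is easy to check with a tally of blob sizes per path. The sketch below assumes its input comes from `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'`; the function name and the sample data are hypothetical. It sums uncompressed sizes, so it is only a rough proxy for pack size, most useful for binaries that compress and delta poorly.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def total_bytes_by_path(batch_check_output: str) -> List[Tuple[str, int, int]]:
    """Return (path, total_bytes, version_count) rows, largest total first."""
    totals: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for line in batch_check_output.splitlines():
        # Expected line shape: "<type> <object id> <size> <path>"
        parts = line.split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and objects without a path
        size, path = int(parts[2]), parts[3]
        totals[path][0] += size
        totals[path][1] += 1
    rows = [(path, t[0], t[1]) for path, t in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical history: one 50 MB installer committed once,
# and a 2 MB image committed 40 times.
sample = "\n".join(
    ["blob aaa 50000000 assets/installer.bin"]
    + [f"blob b{i:02d} 2000000 assets/sprite.png" for i in range(40)]
)
rows = total_bytes_by_path(sample)
for path, total, versions in rows:
    print(path, total, versions)
```

In this made-up history the 2 MB file accounts for more stored bytes than the 50 MB one, which is the pattern commenters single out as the better candidate for Git LFS or history rewriting.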
Monorepos vs many repos
- Some argue this is an avoidable, self‑inflicted monorepo problem (thousands of packages in one repo).
- Others counter that multiple repos introduce painful cross‑repo versioning and atomic-change issues; tooling complexity is the lesser evil.
- Examples from large organizations are cited on both sides, with no clear consensus.
Azure DevOps, Teams, and internal tooling
- Surprise that Azure DevOps is heavily used internally; several detailed complaints about its UX, reliability, security limitations, and slower feature development vs GitHub.
- Mixed views on Office web apps and Teams: some praise complex cross‑platform collaboration; others report severe bugs, performance issues, and configuration fragility.
Network, “Europe” remark & cloning
- The “folks in Europe can’t clone” line sparks debate.
- Many note European consumer connections are often very fast; problems are more about transatlantic latency, flaky VPN/corporate networks, or packet loss than raw bandwidth.
- Some personal anecdotes describe huge repos timing out over high-latency or unreliable links.
Article style, wording, and tone
- Several readers find the gifs and “color” distracting and wish for a clearer technical explanation.
- Others help reconstruct and clarify the technical details via linked PRs and cover letters.
- Thread includes side discussions and jokes about the title wording (“shrank/shrunk/shrunken/shrunked”) and other lighthearted quips.