My failed attempt to shrink all NPM packages by 5%
RFC process and decision dynamics
- Several commenters emphasize that the RFC was not explicitly rejected but effectively “soft‑rejected” by demanding more user‑impact evidence than a volunteer could reasonably supply.
- Others argue the npm team’s request for concrete user benefit before changing every package publish is entirely reasonable.
- There’s tension between expectations for an open‑source project (be grateful for free improvements) and a critical infrastructure service (opt for extreme conservatism around core behavior).
- Some think closing the RFC while saying “this warrants further discussion” practically guarantees the discussion dies; others say long‑term design work shouldn’t live in RFCs alone.
Cost–benefit of a 5% reduction
- One camp views 5% bandwidth savings as significant at npm scale (4.5–5 PB/week → ~225–250 TB/week saved, potentially tens of thousands of dollars/year plus lower CO₂).
- Another camp sees this as marginal relative to overall costs, and not worth extra complexity or risk, especially if the people paying the bandwidth bill aren’t pushing for it.
- There’s a broader philosophical debate: many tiny 2–5% wins aggregate into large systemic speedups vs. “if it ain’t broke, don’t fix it” for foundational tooling.
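The bandwidth figures in the first bullet are just arithmetic; a quick sanity check (using the hypothetical traffic numbers quoted in the thread, not official npm statistics):

```python
# Back-of-envelope check of the figures quoted above. The traffic numbers
# are the ones cited in the discussion, not official npm stats.
weekly_traffic_pb = (4.5, 5.0)            # PB/week, as cited in the thread
savings_ratio = 0.05                      # the ~5% Zopfli size reduction

for pb in weekly_traffic_pb:
    saved_tb = pb * 1000 * savings_ratio  # 1 PB = 1000 TB (decimal units)
    print(f"{pb} PB/week -> {saved_tb:.0f} TB/week saved")
```

Running this reproduces the ~225–250 TB/week range claimed above.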
Zopfli’s performance and tradeoffs
- Reported real‑world numbers: gzip/tar ~1.2 seconds vs Zopfli ~2.5 minutes for a big package like TypeScript; other benchmarks show Zopfli 28–2700× slower than optimized gzip (pigz+zlib‑ng) for only ~5–7% size savings.
- Compress‑once / download‑many argument: even extreme slowdown on publish might amortize well over millions of downloads; critics counter that CI pipelines and frequent builds would also pay the price.
- Multiple people stress decompression cost is essentially unchanged (same DEFLATE decoder, smaller stream), so client runtime and energy use should be neutral or slightly better.
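The "same DEFLATE decoder" point can be demonstrated with the standard library; zlib's level 9 stands in for Zopfli here (Zopfli spends far more effort but still emits standard DEFLATE):

```python
import zlib

# Sketch of the "decompression cost is unchanged" argument: a stream
# produced with more compression effort is smaller, but any client
# decodes it through the identical code path.
data = b"const x = require('lodash');\n" * 2000

fast = zlib.compress(data, 1)   # cheap to produce, larger stream
best = zlib.compress(data, 9)   # expensive to produce, smaller stream

assert len(best) < len(fast)
# One decoder handles both; the client never knows the effort spent:
assert zlib.decompress(fast) == zlib.decompress(best) == data
```

This is why the publish-side cost is the whole debate: the download side pays nothing extra, and slightly less I/O.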
Alternatives and rollout ideas
- Alternatives raised: Brotli, zstd, bzip2/xz, shared dictionaries, or switching to uncompressed tar + HTTP‑level content negotiation.
- Various rollout strategies are suggested:
  - Only apply stronger compression to very popular or very large packages (top 50–5000 by traffic, or >X GiB/week).
  - New format (e.g., zstd) published alongside gzip; new clients prefer it, old ones keep using gzip.
  - “Strong compression” as an opt‑in flag for release builds or CI, not the default.
  - Backend or proxy recompression of hot packages, though tarball hashing and lockfile checksums complicate this.
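The "publish both formats, negotiate at download time" idea above can be sketched as follows; the variant table, package name, and function are illustrative and not part of any real npm API:

```python
# Hypothetical sketch of side-by-side formats with client-driven selection,
# in the spirit of HTTP Accept-Encoding negotiation. All names are made up.
VARIANTS = {
    "left-pad-1.3.0": {
        "gzip": "left-pad-1.3.0.tgz",       # always present, for old clients
        "zstd": "left-pad-1.3.0.tar.zst",   # optional newer format
    }
}

def pick_variant(package: str, accepted: list) -> str:
    """Return the tarball matching the client's preference order,
    falling back to gzip so old clients keep working unchanged."""
    available = VARIANTS[package]
    for encoding in accepted:
        if encoding in available:
            return available[encoding]
    return available["gzip"]

# A new client prefers zstd; an old client only understands gzip.
assert pick_variant("left-pad-1.3.0", ["zstd", "gzip"]).endswith(".tar.zst")
assert pick_variant("left-pad-1.3.0", ["gzip"]).endswith(".tgz")
```

The fallback branch is what makes this rollout backward-compatible: no existing client ever sees the new format.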
Developer experience and CI impact
- Many consider massively increased publish time (seconds → minutes) unacceptable, especially when package creation happens on every CI build or test run.
- Others argue publishing is relatively infrequent, and release‑only compression would limit the pain.
- Concerns about needing a native or WASM binding to Zopfli are seen by some as a genuine maintenance risk, by others as overblown given Node already depends on large native components.
Checksums, compatibility, and security
- Recompressing existing tarballs in place breaks recorded checksums in package‑lock files and other tooling; some propose hashing uncompressed tar contents instead.
- Hashing compressed blobs is defended as reducing attack surface (decompressors only see verified data; decompression bugs are a known vector).
- Suggestions for proxies that recompress and re‑sign packages prompt questions around integrity, zipbomb‑style attacks, and where trust boundaries should sit.
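The checksum problem above is easy to show concretely: hashing the compressed blob breaks under recompression, while hashing the uncompressed contents survives it. A minimal sketch (the tar bytes are a stand-in, not a real package):

```python
import gzip
import hashlib

# Recompressing the same tar at a different level changes the compressed
# bytes (even the gzip header's XFL byte differs by level), so any
# lockfile checksum over the blob breaks.
tar_bytes = b"fake tar contents " * 1000   # stand-in for an uncompressed tar

blob_v1 = gzip.compress(tar_bytes, compresslevel=1, mtime=0)
blob_v2 = gzip.compress(tar_bytes, compresslevel=9, mtime=0)  # "recompressed"

# Compressed-blob hashes differ -> existing lockfile entries would break:
assert hashlib.sha512(blob_v1).digest() != hashlib.sha512(blob_v2).digest()
# Content hashes agree -> a contents-based integrity field would survive:
assert (hashlib.sha512(gzip.decompress(blob_v1)).digest()
        == hashlib.sha512(gzip.decompress(blob_v2)).digest())
```

The security counterargument is visible here too: verifying the content hash requires decompressing first, so the decompressor runs on unverified bytes.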
NPM ecosystem bloat and better wins
- Many argue bigger savings likely lie in:
  - Stripping cruft from packages (tests, docs, binaries, platform‑specific fallbacks) via `.npmignore` or `files` allowlists.
  - Using pnpm‑style shared stores / symlinks or Yarn PnP to avoid duplicating dependencies across projects.
  - Organizational package proxies and caching, reducing repeated network fetches in CI.
- Some see compression tweaks as a band‑aid over deeper structural inefficiencies in how node_modules and registries are managed.
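The `files` allowlist mentioned above is a one-field change in `package.json`; a hypothetical example (the package name and paths are illustrative):

```json
{
  "name": "my-package",
  "version": "1.0.0",
  "files": ["dist/", "README.md"]
}
```

With this, npm publishes only the listed paths (plus always-included files like `package.json` and the license), so tests, docs sources, and fixtures never reach the registry at all, often saving far more than 5%.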
Broader themes: optimization culture and governance
- Several comments generalize this story to:
  - The difficulty of making “small but global” changes in mature ecosystems without over‑ or under‑reacting to risk.
  - How large open‑source projects increasingly behave like cautious enterprises, which can discourage volunteers working on low‑visibility optimizations.
  - Disagreement on whether such 5% wins are essential engineering discipline or a distraction from user‑visible problems.