Make Ubuntu packages 90% faster by rebuilding them
Claimed speedups and how they’re measured
- The gist’s “90% faster” claim is debated.
- Some point out it’s actually ~1.9× throughput (i.e. ~45% less time), not a 90% reduction in runtime.
- Others note this ambiguity is common: “X% faster” is interpreted variously as time reduction vs rate increase.
- A commenter recomputes the combined impact of better compiler flags and mimalloc, showing the math matches the reported ~1.9× gain.
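The two readings of “X% faster”, and the way separate gains combine, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are invented here, the 1.9× and ~44% figures are the ones quoted in the thread, and treating the allocator and compiler-flag gains as multiplicative is an assumption.

```python
def time_reduction(speedup: float) -> float:
    """Fraction of runtime saved for a given throughput ratio (1.9 means 1.9x)."""
    return 1.0 - 1.0 / speedup


def speedup_from_time_reduction(reduction: float) -> float:
    """Throughput ratio implied by cutting runtime by `reduction` (0.9 means 90% less time)."""
    return 1.0 / (1.0 - reduction)


def remaining_factor(total: float, known: float) -> float:
    """Speedup the other change must contribute if two gains multiply to `total`."""
    return total / known


if __name__ == "__main__":
    print(f"1.9x throughput saves {time_reduction(1.9):.0%} of the runtime")
    print(f"90% less runtime would mean {speedup_from_time_reduction(0.9):.0f}x throughput")
    # Assumes the ~44% allocator gain is a 1.44x throughput ratio.
    print(f"1.9x total with 1.44x from the allocator leaves {remaining_factor(1.9, 1.44):.2f}x for flags")
```

By this arithmetic a 1.9× throughput gain is roughly 47% less runtime, in the same ballpark as the ~45% figure, while a true 90% runtime reduction would require a 10× gain.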
Compiler flags, CPU tuning, and -O3
- Rebuilding jq with -O3, LTO, and -march=native (or higher x86-64 levels) yields noticeable speedups over Ubuntu’s conservative builds.
- Some highlight Ubuntu still targets x86-64-v1 for broad compatibility; distros that build for x86-64-v3 see large wins on modern CPUs.
- There’s disagreement about -O3:
- One camp calls it risky (more exposure to UB, rare compiler bugs).
- Others counter this is mostly folklore; the UB is in the code, not created by -O3, and large projects routinely ship with high optimizations.
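Whether a build targeting a higher x86-64 level would even run on a given machine depends on the CPU's feature flags. As a rough sketch, the level can be estimated from /proc/cpuinfo on Linux. The flag-to-level table below is an abbreviated assumption of this example, not a definitive list, and the function names are hypothetical.

```python
# Feature flags as spelled in Linux /proc/cpuinfo, grouped by x86-64
# microarchitecture level. This mapping is an assumption of the sketch
# and is abbreviated; the x86-64 psABI holds the definitive lists.
LEVEL_FLAGS = {
    "x86-64-v2": {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    "x86-64-v3": {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    "x86-64-v4": {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def highest_level(cpu_flags: set[str]) -> str:
    """Return the highest level for which this and every lower level is satisfied."""
    level = "x86-64-v1"  # baseline, assumed for any 64-bit x86 CPU
    for name, required in LEVEL_FLAGS.items():  # dict keeps insertion order
        if not required <= cpu_flags:
            break
        level = name
    return level


def read_cpu_flags(path: str = "/proc/cpuinfo") -> set[str]:
    """Parse the first 'flags' line of /proc/cpuinfo (Linux on x86 only)."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or the file is unreadable
    return set()


if __name__ == "__main__":
    # On non-x86 or non-Linux systems no flags are found and the
    # baseline label is printed, which is not meaningful there.
    print(highest_level(read_cpu_flags()))
```

A machine reporting x86-64-v3 here is the kind of CPU where the v3-targeted distro builds mentioned above would apply; -march=native goes further and ties the binary to the build machine's exact CPU.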
Memory allocators: glibc vs mimalloc/jemalloc
- A key finding: swapping glibc malloc for mimalloc accounts for a large part of the gain; preloading mimalloc alone gives ~44% speedup.
- Many claim “everything outperforms glibc malloc”; it’s seen as slow, with poor multi-thread behavior and fragmentation issues (e.g. per-thread arenas that never return memory).
- Others stress engineering tradeoffs:
- glibc is a conservative “reliable generalist”; alternative allocators optimize for throughput, concurrency, or fragmentation differently.
- Long-lived, allocation-heavy services can hit subtle pathologies with any allocator; some report glibc was actually the most stable in specific video/editing workloads.
- There’s caution about mixing allocators (e.g. one for app, another in a library), which can cause crashes.
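The allocator comparison can be reproduced without rebuilding anything, because the dynamic linker's LD_PRELOAD variable swaps the allocator at launch. The following is a minimal timing sketch, not the gist's actual method: the helper names are invented here, and the library path and command line are supplied by the caller rather than assumed.

```python
import os
import subprocess
import sys
import time


def preload_env(library=None):
    """Copy the current environment, optionally setting LD_PRELOAD."""
    env = dict(os.environ)
    if library:
        env["LD_PRELOAD"] = library
    return env


def best_time(cmd, library=None, runs=5):
    """Run cmd `runs` times and return the fastest wall-clock time in seconds."""
    env = preload_env(library)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Hypothetical usage: python bench.py <path to libmimalloc .so> jq -c . big.json
# The first argument is the allocator library; the rest is the command to time.
if __name__ == "__main__" and len(sys.argv) >= 3:
    library, cmd = sys.argv[1], sys.argv[2:]
    base = best_time(cmd)
    swapped = best_time(cmd, library)
    print(f"default: {base:.3f}s  preloaded: {swapped:.3f}s  ratio: {base / swapped:.2f}x")
```

Taking the minimum of several runs reduces noise from caching and scheduling, but it remains a micro-benchmark of one command on one input.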
Security and maintenance
- Rebuilding outside the distro means losing automatic security updates, notably for dependencies like oniguruma; this is a real concern for jq parsing untrusted JSON.
- A side thread debates whether changing allocator/flags might “help” security via layout changes; this morphs into an extended argument about ASLR and “security through obscurity”, with no consensus.
Scope, generality, and micro-benchmark concerns
- Multiple commenters stress this result is for one tool (jq) on one workload; it’s not “Ubuntu packages” in general.
- Some argue this is essentially the Gentoo model: recompiling for your CPU and preferences can help, but:
- Build times, complexity, and potential new bugs limit broad applicability.
- Distros could instead provide prebuilt variants (e.g. x86-64-v3, tuned builds for a few hot packages).
Alternatives to “just make jq faster”
- Some suggest avoiding repeated heavy jq runs by:
- Converting large GeoJSON to more efficient formats (Parquet, GeoParquet, GeoPackage, FlatGeobuf, DuckDB/ClickHouse) and querying there, often orders of magnitude faster.
- Others recommend jq-like tools implemented differently:
- jaq (Rust jq clone) and jj are cited; jaq can be faster on some workloads but currently lacks full jq feature parity (e.g. strftime).
- A Go example parsing ~1.3GB GeoJSON with the standard JSON library shows competitive or better performance than the jq timings.
Distros, tools, and “right way” to rebuild
- Gentoo, Arch (with ALHP), Guix, and Clear Linux are mentioned as ecosystems where tuning flags, allocators, and CPU targets is first-class.
- Several suggest using distro-native mechanisms (e.g. apt-get source jq and rebuilding, or Gentoo/Guix transforms) to preserve integration and security updates rather than ad-hoc local builds.
jq itself
- Separate mini-thread: some find jq’s point-free syntax initially opaque; others share resources explaining it and note that once understood, it’s powerful.