Make Ubuntu packages 90% faster by rebuilding them

Claimed speedups and how they’re measured

  • The gist’s “90% faster” claim is debated.
    • Some point out it’s actually ~1.9× throughput (i.e. ~47% less time, since 1 − 1/1.9 ≈ 0.47), not a 90% reduction in runtime.
    • Others note this ambiguity is common: “X% faster” can be read either as a reduction in time or as an increase in rate.
  • A commenter recomputes the combined impact of better compiler flags and mimalloc, showing the math matches the reported ~1.9× gain.
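
The arithmetic behind these points can be sketched in a few lines. This is an illustration only: the helper names are made up here, and the flags-only factor is not a measured figure. It is derived from the two numbers quoted in the discussion (about 1.9× overall, about 1.44× from mimalloc alone) under the assumption that independent gains multiply.

```python
def time_reduction(speedup):
    """Fraction of runtime saved when throughput is multiplied by `speedup`."""
    return 1.0 - 1.0 / speedup


def implied_factor(total, known):
    """Remaining multiplicative factor, assuming independent gains multiply."""
    return total / known


total_speedup = 1.9   # "90% faster" read as a rate increase
mimalloc_only = 1.44  # ~44% speedup reported for preloading mimalloc alone

saved = time_reduction(total_speedup)
flags_only = implied_factor(total_speedup, mimalloc_only)  # derived, not measured

print(f"runtime saved: {saved:.1%}")                        # runtime saved: 47.4%
print(f"implied flags-only factor: {flags_only:.2f}x")      # implied flags-only factor: 1.32x
```

By the same formula, a genuine 90% reduction in runtime would require a 10× increase in throughput, which is why the two readings of “90% faster” differ so much.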

Compiler flags, CPU tuning, and -O3

  • Rebuilding jq with -O3, LTO, and -march=native (or higher x86-64 levels) yields noticeable speedups over Ubuntu’s conservative builds.
  • Some highlight that Ubuntu still targets the x86-64 baseline (x86-64-v1) for broad compatibility; distros that build for x86-64-v3 see large wins on modern CPUs.
  • There’s disagreement about -O3:
    • One camp calls it risky: more exposure to undefined behavior (UB) and to rare compiler bugs.
    • Others counter this is mostly folklore; the UB is in the code, not created by -O3, and large projects routinely ship with high optimizations.
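
Whether a build for a higher x86-64 level can run on a given machine depends on the CPU features present. The following is a rough sketch, assuming Linux on x86-64 and the feature names used on the "flags" line of /proc/cpuinfo. The function names are hypothetical, the feature sets are an approximation of the psABI level definitions, and the baseline features are not checked.

```python
# Feature names as they appear on the "flags" line of /proc/cpuinfo (Linux).
# Approximate sets; the baseline (cmov, sse2, ...) is assumed rather than verified.
LEVELS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level whose features (and all lower levels') are present."""
    level = "x86-64-v1"  # informal name for the baseline
    for name, required in LEVELS:
        if not required <= set(flags):
            break  # levels are cumulative, so stop at the first gap
        level = name
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the first "flags" line; returns an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    # Only meaningful on x86-64 Linux; elsewhere this falls back to the baseline name.
    print(x86_64_level(read_cpu_flags()))
```

Recent GCC and Clang releases accept these level names directly (for example -march=x86-64-v3), which gives a portable middle ground between the baseline and -march=native.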

Memory allocators: glibc vs mimalloc/jemalloc

  • A key finding: swapping glibc malloc for mimalloc accounts for a large part of the gain; preloading mimalloc alone gives a ~44% speedup.
  • Many claim “everything outperforms glibc malloc”; it’s seen as slow, with poor multi-thread behavior and fragmentation issues (e.g. per-thread arenas that never return memory).
  • Others stress engineering tradeoffs:
    • glibc is a conservative “reliable generalist”; alternative allocators optimize for throughput, concurrency, or fragmentation differently.
    • Long-lived, allocation-heavy services can hit subtle pathologies with any allocator; some report glibc was actually the most stable in specific video/editing workloads.
  • There’s caution about mixing allocators in one process (e.g. one for the app, another in a library): memory allocated by one and freed by the other can cause crashes.
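
Preloading an allocator requires no rebuild: on Linux with glibc, the dynamic loader reads the LD_PRELOAD environment variable and loads the named library before the program's other dependencies. The following is a minimal sketch. The helper names are hypothetical, and the library path is an assumption that varies by distro and package.

```python
import os
import shutil
import subprocess


def preload_env(lib_path, base_env=None):
    """Copy of the environment with `lib_path` prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def run_with_allocator(cmd, lib_path):
    """Run `cmd` with the allocator library preloaded (dynamically linked binaries only)."""
    return subprocess.run(cmd, env=preload_env(lib_path), check=True)


# Assumed location; check where your mimalloc package installs the shared library.
MIMALLOC = "/usr/lib/x86_64-linux-gnu/libmimalloc.so"

if __name__ == "__main__" and os.path.exists(MIMALLOC) and shutil.which("jq"):
    run_with_allocator(["jq", "--version"], MIMALLOC)
```

Because the preloaded library replaces malloc for the whole process, this approach is a quick way to test an allocator's effect on one command before committing to a rebuild.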

Security and maintenance

  • Rebuilding outside the distro means losing automatic security updates, notably for dependencies like oniguruma; this is a real concern for jq parsing untrusted JSON.
  • A side thread debates whether changing allocator/flags might “help” security via layout changes; this morphs into an extended argument about ASLR and “security through obscurity”, with no consensus.

Scope, generality, and micro-benchmark concerns

  • Multiple commenters stress this result is for one tool (jq) on one workload; it’s not “Ubuntu packages” in general.
  • Some argue this is essentially the Gentoo model: recompiling for your CPU and preferences can help, but:
    • Build times, complexity, and potential new bugs limit broad applicability.
    • Distros could instead provide prebuilt variants (e.g. x86-64-v3, tuned builds for a few hot packages).

Alternatives to “just make jq faster”

  • Some suggest avoiding repeated heavy jq runs by:
    • Converting large GeoJSON to more efficient formats (Parquet, GeoParquet, GeoPackage, FlatGeobuf) or loading it into an engine such as DuckDB or ClickHouse and querying there, often orders of magnitude faster.
  • Others recommend jq-like tools implemented differently:
    • jaq (Rust jq clone) and jj are cited; jaq can be faster on some workloads but currently lacks full jq feature parity (e.g. strftime).
  • A Go example parsing ~1.3 GB of GeoJSON with the standard JSON library shows performance competitive with or better than the jq timings.
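
The idea of replacing a fixed jq query with a few lines in a general-purpose language can be sketched as follows. This is not the Go program from the thread: it is a Python analogue using only the standard library, the function name and sample data are invented for illustration, and it makes no performance claim. It also loads the whole document into memory, which matters at the file sizes discussed.

```python
import io
import json
from collections import Counter


def geometry_counts(fp):
    """Count features per geometry type in a GeoJSON FeatureCollection."""
    doc = json.load(fp)  # reads the entire document into memory
    counts = Counter()
    for feature in doc.get("features", []):
        geometry = feature.get("geometry") or {}  # GeoJSON allows a null geometry
        counts[geometry.get("type", "null")] += 1
    return counts


# Tiny made-up sample standing in for a real GeoJSON file.
sample = io.StringIO(json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [0, 0]}, "properties": {}},
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [1, 1]}, "properties": {}},
        {"type": "Feature", "geometry": None, "properties": {}},
    ],
}))
print(dict(geometry_counts(sample)))
```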

Distros, tools, and “right way” to rebuild

  • Gentoo, Arch (with ALHP), Guix, and Clear Linux are mentioned as ecosystems where tuning flags, allocators, and CPU targets is first-class.
  • Several suggest using distro-native mechanisms (e.g. apt-get source jq and rebuilding, or Gentoo/Guix transforms) to preserve integration and security updates rather than ad-hoc local builds.

jq itself

  • Separate mini-thread: some find jq’s point-free syntax initially opaque; others share resources explaining it and note that once understood, it’s powerful.