ISO PDF spec is getting Brotli – ~20% smaller documents with no quality loss
Origin and Motivation of Brotli-in-PDF
- One view: a commercial vendor is “paying to legalize” an SDK incompatible with existing readers, enabled by ISO’s “pay to play” structure.
- Counterpoint: others state the feature originates from the PDF Association technical working group, not a specific vendor, and that open-source engines (MuPDF, Ghostscript) have added experimental support to aid interoperability testing.
- Some readers find the article’s tone and slogans (“files ahead of their time”) salesy or AI-generated, which increases skepticism.
Choice of Brotli vs Alternatives (zstd, gzip, xz, lzma2)
- Many argue zstd would be a better fit for a read-mostly format: similar or slightly worse compression than Brotli, but much faster decompression.
- Benchmarks in the thread show:
  - Brotli typically compresses PDFs slightly better than zstd (≈1–4% smaller output) when both run at their highest levels.
  - zstd decompresses substantially faster; gzip is fastest but has worse ratios; xz/lzma2 compress well but decompress very slowly.
- Some commenters claim zstd is “Pareto better”; others counter that this does not hold at maximum-compression settings, where Brotli still produces smaller output.
- A few see Brotli’s Google origin as a soft political factor; others dismiss conspiracy framing and note both Brotli and zstd are now widely available.
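The ratio-versus-decompression-speed trade-off debated above can be checked on one's own files. The following is a minimal sketch, not the benchmark used in the thread: it always measures Deflate and xz from the Python standard library, and adds Brotli and zstd only if the third-party `brotli` and `zstandard` packages happen to be installed. The sample data and the chosen levels are assumptions made for illustration.

```python
import lzma
import time
import zlib


def benchmark(name, compress, decompress, data, repeats=5):
    """Return (name, compressed size in bytes, best decompression time in seconds)."""
    blob = compress(data)
    assert decompress(blob) == data  # round-trip sanity check
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return name, len(blob), best


def candidates():
    """Codecs to compare; Brotli and zstd are optional third-party packages."""
    codecs = [
        ("deflate-9", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("xz-9", lambda d: lzma.compress(d, preset=9), lzma.decompress),
    ]
    try:
        import brotli
        codecs.append(("brotli-11",
                       lambda d: brotli.compress(d, quality=11),
                       brotli.decompress))
    except ImportError:
        pass  # skipped when the package is not installed
    try:
        import zstandard
        codecs.append(("zstd-19",
                       lambda d: zstandard.ZstdCompressor(level=19).compress(d),
                       lambda b: zstandard.ZstdDecompressor().decompress(b)))
    except ImportError:
        pass
    return codecs


# Assumed stand-in for a PDF content stream; replace with real stream bytes.
sample = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n" * 2000

results = [benchmark(n, c, d, sample) for n, c, d in candidates()]
for name, size, seconds in results:
    print(f"{name:10s} {size:8d} bytes  {seconds * 1e3:.3f} ms to decompress")
```

Highly repetitive synthetic input like `sample` exaggerates ratios, so numbers from real PDF streams are the ones worth comparing.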
Custom Dictionaries and Long-Term Design
- Question raised: why use Brotli’s built‑in web‑corpus dictionary (with 2015 HTML/swear-word biases) for PDFs at all?
- Concerns:
  - Symbol statistics in PDFs differ from web pages.
  - Baking that dictionary into a long‑lived archival format ties future readers to a dated corpus.
- The PDF Association is reportedly still experimenting with custom dictionaries; one commenter expects only modest extra gains (~1%), except for very small streams or streams that restart compression per page.
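Why a dictionary helps mostly on small streams can be illustrated with Deflate's preset-dictionary feature, which Python's standard `zlib` module exposes through the `zdict` parameter. This is only an analogy: Brotli's dictionary mechanism is different and is not shown here, and `PDF_DICT` and the sample streams are hypothetical.

```python
import zlib

# Hypothetical preset dictionary: boilerplate that many content streams might share.
PDF_DICT = b"q Q cm re f\nBT /F1 12 Tf 72 720 Td (" + b") Tj ET\n"


def deflate(data, zdict=None):
    """Compress with Deflate, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(blob, zdict=None):
    """Decompress; the reader must hold the exact same dictionary bytes."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


def relative_gain(data, zdict):
    """Fraction of compressed size saved by using the dictionary."""
    return 1 - len(deflate(data, zdict)) / len(deflate(data))


# A tiny stream, and a long one that builds up its own history as it goes.
small = b"BT /F1 12 Tf 72 720 Td (Hi) Tj ET\n"
large = b"".join(
    b"BT /F1 12 Tf 72 %d Td (Line %d) Tj ET\n" % (720 - i % 600, i)
    for i in range(2000)
)

print(f"small stream gain: {relative_gain(small, PDF_DICT):.1%}")
print(f"large stream gain: {relative_gain(large, PDF_DICT):.1%}")
```

A long stream quickly accumulates its own back-references, so the dictionary only pays off for the first few bytes. The `inflate` helper also makes the archival concern concrete: decoding fails without the identical dictionary.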
Backward Compatibility and Deployment Risk
- Strong criticism that this is a breaking change: older readers supporting only Deflate cannot open Brotli-compressed PDFs.
- This is seen as contradicting the PDF Association’s stated principle that new features must “work seamlessly with existing readers.”
- Some note that many devices and embedded viewers cannot be updated, eroding one of PDF’s core strengths (reliable universal readability).
- Several argue that saving ~20% file size is not worth years of fragmented compatibility and that tools should wait until Brotli support is ubiquitous in major renderers.
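For tools that want to check whether a file stays readable by Deflate-era readers, one option is to scan for stream filter names outside the long-standing set. The sketch below is a rough heuristic under stated assumptions: it only looks at raw bytes, so it misses `/Filter` entries hidden inside compressed object streams, and the name `BrotliDecode` in the sample is an assumed placeholder rather than something confirmed in the thread.

```python
import re

# Standard filter names that pre-Brotli readers can be expected to know.
LEGACY_FILTERS = {
    b"FlateDecode", b"LZWDecode", b"ASCIIHexDecode", b"ASCII85Decode",
    b"RunLengthDecode", b"CCITTFaxDecode", b"DCTDecode", b"JBIG2Decode",
    b"JPXDecode", b"Crypt",
}

# Matches "/Filter /Name" as well as "/Filter [/A /B]".
FILTER_ENTRY = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME = re.compile(rb"/([A-Za-z0-9]+)")


def filters_used(pdf_bytes):
    """Collect every filter name found in uncompressed dictionaries."""
    names = set()
    for match in FILTER_ENTRY.finditer(pdf_bytes):
        names.update(NAME.findall(match.group(1)))
    return names


def unsupported_filters(pdf_bytes, supported=LEGACY_FILTERS):
    """Filter names a reader limited to `supported` could not decode."""
    return filters_used(pdf_bytes) - set(supported)


# Hand-written fragment, not a complete PDF; "BrotliDecode" is an assumed name.
sample = (
    b"1 0 obj << /Length 10 /Filter /FlateDecode >> stream ... endstream\n"
    b"2 0 obj << /Length 12 /Filter [/ASCII85Decode /BrotliDecode] >> stream ...\n"
)

print(unsupported_filters(sample))
```

A writer could use a check like this to fall back to Deflate when the target audience includes readers that cannot be updated.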
Compression-in-Format vs Transport/Filesystem
- Some question the point of embedding a general-purpose compressor when:
  - Filesystems can already compress (often with zstd/lz4).
  - HTTP can already apply Brotli/zstd via Content-Encoding.
- Others reply that in-PDF filters allow:
  - Different methods per stream (e.g., JPEG for images, Brotli for text).
  - Page-level access without decompressing the entire file.
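The page-level access argument can be shown with a toy container in which each page is compressed on its own and located through an offset index. This is a simplified sketch, not PDF syntax: the function names and layout are made up for illustration, and Deflate from the standard library stands in for whichever filter a stream declares. In a real PDF the cross-reference table plays the role of the index.

```python
import io
import zlib


def write_container(pages):
    """Compress each page separately; return (blob, index of (offset, length))."""
    buf = io.BytesIO()
    index = []
    for page in pages:
        chunk = zlib.compress(page, 9)
        index.append((buf.tell(), len(chunk)))
        buf.write(chunk)
    return buf.getvalue(), index


def read_page(blob, index, n):
    """Decompress only page n; the other pages are never touched."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])


pages = [f"Contents of page {i}\n".encode() * 50 for i in range(100)]
blob, index = write_container(pages)

print(read_page(blob, index, 42)[:22])
```

Compressing the whole file at the transport or filesystem layer usually gives a better ratio, but a viewer then has to decompress from the start of a compressed block to reach any one page, which is the trade-off the replies point to.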
Broader Reflections on PDF Evolution
- Several see this as another instance of PDF accreting complexity (like XFA, JavaScript), undermining the “always opens the same” promise.
- Others note that PDF has always evolved via versioned, sometimes breaking changes, albeit slowly and conservatively.
- Some would prefer more impactful “breaking” additions, such as native JPEG XL image support, since images often dominate PDF size.