OpenZFS deduplication is good now and you shouldn't use it

Where ZFS dedup helps vs. where it doesn’t

  • Strong wins reported for:
    • Many similar VMs / templates on shared storage (classic enterprise use; also some home labs).
    • Highly duplicated build inputs or archives (build pools, personal “dumping ground” archives, the Nix store, Flatpak/OSTree-like setups).
    • Some users see ~3–8x space savings in these narrow workloads, sometimes making NVMe storage economically viable.
  • Many commenters report that “general purpose” desktop/laptop or mixed file-server workloads show little benefit.
  • Logs and text usually benefit far more from compression than from dedup (a quick demonstration follows).
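
A hedged illustration of that last point (Python’s zlib stands in for ZFS’s LZ4/zstd, and the log line is invented; the principle is the same):

    import zlib

    # Highly repetitive text, as in typical logs.
    log = b"2024-05-01T12:00:00Z INFO request served status=200 dur=12ms\n" * 10_000

    compressed = zlib.compress(log, level=6)
    print(f"raw: {len(log):,} B, compressed: {len(compressed):,} B, "
          f"ratio: {len(log) / len(compressed):.0f}x")

Dedup could only collapse identical whole blocks of this data; a stream compressor exploits the redundancy directly, with no global table to maintain.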

Cost, RAM, and performance concerns

  • Traditional ZFS inline dedup requires a large in-RAM dedup table (DDT); the widely cited rule of thumb is several GB of RAM per TB of data, depending on average block size (see the estimate after this list).
  • If the table spills to disk, performance can collapse “to nearly zero.”
  • Every write/free triggers table lookups and updates, even when there is no duplicate, so random or mostly-unique data pays persistent overhead.
  • Dedup matches only whole, fixed-size blocks, so shifted, misaligned, or partially overlapping copies of the same data go undetected.
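
A back-of-the-envelope sketch of that RAM rule of thumb, assuming one DDT entry per unique block and the commonly cited ~320 bytes per in-core entry (real usage varies by pool layout and OpenZFS version):

    # Estimate classic in-RAM dedup-table (DDT) size for a given data size.
    def ddt_ram_bytes(data_bytes: int, avg_block_bytes: int,
                      entry_bytes: int = 320) -> int:
        n_entries = data_bytes // avg_block_bytes  # one entry per unique block
        return n_entries * entry_bytes

    TiB = 1024 ** 4
    for block_kib in (128, 64, 16):
        gib = ddt_ram_bytes(TiB, block_kib * 1024) / 1024 ** 3
        print(f"1 TiB at {block_kib} KiB blocks -> ~{gib:.1f} GiB of DDT")

At the default 128 KiB recordsize this lands around 2.5 GiB per TiB; small-block workloads (databases, zvols) push it far higher, which is why quoted figures span such a wide range.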

Desire for offline / lazy dedup

  • Several people want “lazy” or scrub-time dedup to avoid write-path penalties.
  • Others note this would require block-pointer rewrite across snapshots, which ZFS’s Merkle-tree design effectively forbids: every block pointer embeds the checksum of the block it points to, so rewriting one block would cascade up the tree and into every snapshot that references it.
  • Workarounds discussed:
    • Separate datasets: write to non-dedup dataset, later move to dedup-enabled one.
    • Userspace “offline dedup” with hardlinks or reflinks (rdfind, jdupes, duperemove) once ZFS exposes the right syscalls (a minimal scanner is sketched after this list).
    • Planned/desired tools that scan for identical file ranges and convert them to cloned blocks.
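
A minimal sketch of the userspace approach, roughly what rdfind/jdupes do: group files by size, confirm with a content hash, and report duplicate groups. A real tool would then hardlink, reflink, or issue a dedup ioctl on the matches; this sketch only prints them, and the path handling is illustrative:

    import hashlib
    import os
    import sys
    from collections import defaultdict

    def sha256_of(path: str, bufsize: int = 1 << 20) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while chunk := f.read(bufsize):
                h.update(chunk)
        return h.hexdigest()

    def duplicate_groups(root: str):
        by_size = defaultdict(list)
        for dirpath, _, names in os.walk(root):
            for name in names:
                p = os.path.join(dirpath, name)
                if os.path.isfile(p) and not os.path.islink(p):
                    by_size[os.path.getsize(p)].append(p)
        # Hash only files whose sizes collide; most files are unique by size.
        for size, paths in by_size.items():
            if size == 0 or len(paths) < 2:
                continue
            by_hash = defaultdict(list)
            for p in paths:
                by_hash[sha256_of(p)].append(p)
            yield from (g for g in by_hash.values() if len(g) > 1)

    for group in duplicate_groups(sys.argv[1]):
        print("duplicates:", *group)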

Reflinks, block cloning, and alternatives

  • Many argue that modern block cloning / reflinks (ZFS’s Block Reference Table (BRT), copy_file_range, cp --reflink=auto) provide most of the practical benefit (see the copy sketch after this list):
    • Cheap, instantaneous “copies” when the system knows an operation is a copy (VM templates, file copies, containers, Flatpak).
    • No global dedup table; overhead is proportional to actual clones.
  • Consensus: enable ZFS compression almost everywhere; consider dedup only for very specific, proven-high-duplication workloads.
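
A minimal sketch of a clone-friendly copy, assuming Linux and Python 3.8+ (os.copy_file_range lets the kernel and filesystem satisfy the request however they can; on OpenZFS releases with block cloning, that can mean a near-instant clone instead of a byte-for-byte copy; filenames are illustrative):

    import os

    def clone_friendly_copy(src: str, dst: str) -> None:
        with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
            remaining = os.fstat(fsrc.fileno()).st_size
            offset = 0
            while remaining > 0:
                # The filesystem may clone these blocks rather than copy them.
                n = os.copy_file_range(fsrc.fileno(), fdst.fileno(), remaining,
                                       offset_src=offset, offset_dst=offset)
                if n == 0:
                    break
                offset += n
                remaining -= n

    clone_friendly_copy("vm-template.img", "vm-new.img")

cp --reflink=auto does the same in practice, silently falling back to a plain copy when cloning isn’t supported.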

Enterprise arrays vs. filesystems

  • Some report 3:1–6:1+ savings with enterprise arrays (Pure Storage, Dell/EMC, Nimble) and with Windows Server’s built-in dedup.
  • Others point out:
    • Arrays often use smaller blocks, offline or background dedup, and different economics (power, rack space, controller cost).
    • Filesystem-level inline dedup is harder to make generally cheap and safe.

Other themes

  • Security: concern about cross-tenant information leaks via dedup (timing side channels that reveal whether a tenant’s data already exists on the system), echoing earlier attacks on memory-page deduplication.
  • Snapshots: deduplicating or cloning data doesn’t reclaim space until the older snapshots that still reference the original blocks are destroyed.
  • Encryption: stacking ZFS on dm-crypt/LUKS avoids ZFS’s own encryption quirks, but layers below the encryption see only ciphertext, so block-level dedup there is impossible.