A good day to trie-hard: saving compute 1% at a time
Header handling strategy & risks
- Many are surprised Cloudflare uses a “list of header names that are internal” instead of structural separation or strict prefixes.
- Concerns raised: name collisions with user headers, inconsistent lists across services, sanitization bugs, and issues with `Connection` header semantics.
- Some argue this pattern is common in large enterprises and edge proxies; others say that doesn’t make it less fragile.
- Several suggest prefixing all internal headers (CFInt, X-CF-) and stripping by prefix, but others note legacy systems, early headers, third‑party appliances, and acquisitions make global renaming hard.
- It’s stated that a longer‑term plan is to stop using HTTP headers for internal IPC entirely; some worry that makes the trie work a short‑lived stopgap.
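The prefix-stripping idea above can be sketched in a few lines. This is a hypothetical illustration, not Cloudflare's code: the prefixes and header names are assumptions standing in for whatever internal convention a proxy might adopt.

```python
# Hypothetical prefix-based stripping: if every internal header shared an
# agreed prefix, sanitization would not need a per-service name list.
INTERNAL_PREFIXES = ("cfint-", "x-cf-")  # assumed prefixes, for illustration

def strip_internal(headers: dict) -> dict:
    """Drop any header whose name (case-insensitive) starts with an internal prefix."""
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith(INTERNAL_PREFIXES)
    }
```

The appeal is that the check is a single prefix comparison per header; the objection in the thread is that it only works if every producer of internal headers can be migrated to the convention.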
Alternative designs proposed
- Separate metadata channel: dedicated internal protocol (e.g., Protobufs or custom encapsulation) instead of overloading HTTP headers.
- Structural approaches: maintain a list of “allowed to exit” headers instead of “internal to strip” (deny‑by‑default), or record original inbound headers and only emit those.
- Data‑structure tweaks: force internal headers to the front and remove first N; use a header-count sentinel; or tag headers as internal at creation rather than inferring later.
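The deny-by-default variant proposed above inverts the problem: instead of enumerating internal headers to strip, enumerate what may leave. A minimal sketch, with an illustrative allowlist that is purely an assumption:

```python
# Hypothetical allowlist ("allowed to exit") filter: anything not explicitly
# permitted is dropped, so a forgotten internal header fails closed.
ALLOWED_TO_EXIT = {"accept", "content-type", "user-agent", "cookie"}  # illustrative

def filter_outbound(headers: dict) -> dict:
    """Emit only headers on the allowlist; unknown names are silently removed."""
    return {n: v for n, v in headers.items() if n.lower() in ALLOWED_TO_EXIT}
```

The trade-off raised in the discussion: a denylist leaks when incomplete, while an allowlist breaks legitimate user headers when incomplete, which is why some prefer recording the original inbound names and emitting only those.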
Tries vs hashes vs regex
- Discussion on why a trie beats hash tables here: hashing strings requires touching every byte; tries often reject on the first character, and most lookups are misses.
- Alternatives floated: custom fast hash functions, perfect hashing, Bloom/binary-fuse filters, hardware CRC32, or specialized hash maps; others point out these still need substantial hashing work.
- Regex/Aho‑Corasick and DFAs are mentioned as conceptually similar; regex libraries carry general-purpose overhead, DFAs can be faster but use more memory and build time.
- Some critique the article’s Big‑O characterizations, arguing they blur key factors like cache behavior versus comparison counts.
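The trie-vs-hash argument above can be made concrete with a toy trie: a miss can bail out at the first character with no matching child, while a hash table must consume every byte of the key before it can even probe. This is a minimal Python sketch of the general technique, not the `trie-hard` crate's implementation, and the sample header names are assumptions:

```python
# Minimal trie sketch: membership misses usually terminate after one or
# two character lookups, whereas hashing touches the full string first.
class TrieNode:
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}     # char -> TrieNode
        self.terminal = False  # True if a stored word ends here

def build(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
    return root

def contains(root, word):
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:  # early exit: most misses stop within a few chars
            return False
    return node.terminal
```

Since most header-name lookups in this workload are misses, the early exit is where the savings come from; the counterpoint in the thread is that cache behavior, not comparison counts, often dominates in practice.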
Performance impact & ROI debate
- The function optimized is in an extremely hot path, and small per‑request wins aggregate across tens of millions of requests per second.
- Some see saving hundreds of cores as modest relative to the overall fleet and question the engineering ROI compared with tackling larger architectural issues.
- Others counter that recurring CPU, power, and capacity savings, plus improved headroom, justify micro‑optimizations in hot code, and that the write‑up also has marketing and educational value.