What came first: the CNAME or the A record?
DNS fragility and protocol philosophy
- Many see this incident as another instance of “it’s always DNS”: small changes expose long‑standing, obscure interoperability bugs.
- Hyrum’s Law is cited: any observable behavior becomes relied upon, regardless of what specs say.
- Debate around Postel’s Law:
  - Some argue “be liberal in what you accept” leads to brittle ecosystems and security issues; modern practice favors failing fast on malformed data.
  - Others think liberal acceptance is fine if paired with strong warnings and migration paths, though warnings are often ignored in practice.
RFC 1034, ambiguity, and CNAME ordering
- Several commenters argue the RFC text clearly implies CNAMEs must appear first and that “possibly” refers only to presence, not ordering.
- Others think the combination of examples and lack of normative keywords made it reasonable to treat ordering as non‑significant.
- Even if “CNAMEs first” is clear, the RFC is seen as ambiguous about ordering within a CNAME chain; that’s where glibc’s assumptions broke.
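To make the ordering question concrete, here is a minimal Python sketch. It is not glibc's actual code: the record tuples, the example names, and both function names are illustrative assumptions. It contrasts a hypothetical single-pass parser that only accepts records owned by the name it is currently tracking with a parser that builds the CNAME map first and so does not care about order.

```python
from typing import List, Tuple

# (owner name, record type, record data): a simplified stand-in for a DNS answer section.
Record = Tuple[str, str, str]


def resolve_sequential(qname: str, answers: List[Record]) -> List[str]:
    """Single pass: assumes every CNAME appears before the records it points to."""
    current = qname
    addresses = []
    for owner, rtype, rdata in answers:
        if owner != current:
            continue  # a record for a name not reached yet is silently dropped
        if rtype == "CNAME":
            current = rdata
        elif rtype == "A":
            addresses.append(rdata)
    return addresses


def resolve_any_order(qname: str, answers: List[Record]) -> List[str]:
    """Two passes: collect the CNAME map first, then follow the chain."""
    cnames = {owner: rdata for owner, rtype, rdata in answers if rtype == "CNAME"}
    current = qname
    seen = set()  # guards against CNAME loops
    while current in cnames and current not in seen:
        seen.add(current)
        current = cnames[current]
    return [rdata for owner, rtype, rdata in answers
            if rtype == "A" and owner == current]


# Hypothetical answer sections using documentation-reserved names and addresses.
ordered = [
    ("www.example.com", "CNAME", "cdn.example.net"),
    ("cdn.example.net", "CNAME", "edge.example.org"),
    ("edge.example.org", "A", "192.0.2.10"),
]
a_record_first = [ordered[2], ordered[1], ordered[0]]  # address before the CNAMEs
chain_swapped = [ordered[1], ordered[0], ordered[2]]   # CNAMEs first, chain out of order

if __name__ == "__main__":
    for label, answers in [("ordered", ordered),
                           ("a_record_first", a_record_first),
                           ("chain_swapped", chain_swapped)]:
        print(label,
              resolve_sequential("www.example.com", answers),
              resolve_any_order("www.example.com", answers))
```

In this sketch the `chain_swapped` case still puts every CNAME ahead of the address record, yet the single-pass parser returns nothing, which mirrors the commenters' point that "CNAMEs first" alone does not settle ordering within a chain.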
Responsibility, testing, and deployment practices
- Strong criticism that a major public resolver changed CNAME ordering without:
  - byte‑for‑byte golden tests of responses, and
  - integration tests against ubiquitous clients like glibc’s getaddrinfo.
- Surprise that the failure was only discovered in production; some suggest Cloudflare’s test environments likely used stacks (systemd‑resolved, musl, etc.) that masked the bug.
- Others defend cautious rollout and slow rollback as appropriate for a global service.
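The golden-test idea mentioned above can be sketched roughly as follows. This is an assumption-laden illustration, not a description of any real resolver's test suite: `serialize` is a made-up, order-preserving text encoding rather than DNS wire format, and the records are hypothetical. The point is only that a byte-level comparison notices a reordering that a set-style comparison treats as equivalent.

```python
import hashlib
from typing import List, Tuple

# (owner, type, TTL, rdata): simplified records, not real wire format.
Record = Tuple[str, str, int, str]


def serialize(answers: List[Record]) -> bytes:
    """Order-preserving stand-in for encoding an answer section."""
    lines = [f"{owner}\t{ttl}\t{rtype}\t{rdata}" for owner, rtype, ttl, rdata in answers]
    return "\n".join(lines).encode("ascii")


def golden_digest(answers: List[Record]) -> str:
    """Digest of the serialized answer, stored once as the 'golden' value."""
    return hashlib.sha256(serialize(answers)).hexdigest()


def matches_golden(answers: List[Record], expected_digest: str) -> bool:
    """Byte-for-byte check: any change in order, TTL, or data fails."""
    return golden_digest(answers) == expected_digest


def same_record_set(a: List[Record], b: List[Record]) -> bool:
    """Order-insensitive check: only the multiset of records matters."""
    return sorted(a) == sorted(b)


baseline = [
    ("www.example.com", "CNAME", 300, "cdn.example.net"),
    ("cdn.example.net", "A", 60, "192.0.2.10"),
]
candidate = list(reversed(baseline))  # same records, different order
golden = golden_digest(baseline)

if __name__ == "__main__":
    print("set comparison passes:", same_record_set(baseline, candidate))
    print("golden comparison passes:", matches_golden(candidate, golden))
```

A byte-level golden test is deliberately brittle: it would also fail on intentional changes, so each failure needs a human decision about whether the new bytes are acceptable.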
Impact on clients (glibc, Cisco, others)
- glibc’s resolver assuming ordered CNAMEs is seen as a serious but long‑hidden bug.
- Cisco switches reboot‑looping on unexpected answers is viewed as especially egregious.
- Some note most other resolvers were tolerant, reinforcing the de facto expectation that servers preserve traditional ordering.
Standards process and de facto behavior
- There’s support for clarifying DNS behavior via an Internet‑Draft, but some dislike Cloudflare’s pattern of “breaking behavior then writing RFCs.”
- Others emphasize that, decades on, the real “spec” is what widely‑deployed software expects, not just old text.
Broader DNS and CNAME issues
- Discussion widens to SERVFAIL semantics, qname minimization, and DNS’s underspecification in edge cases.
- Multiple commenters criticize allowing CNAMEs to coexist with other record types at the same name, and recall earlier Cloudflare features that stretched or violated CNAME rules.
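The coexistence complaint can be illustrated with a small zone check. This is a hedged sketch under stated assumptions: the function name, the record tuples, and the zone data are hypothetical, and the set of types tolerated next to a CNAME (DNSSEC's RRSIG and NSEC) is a simplification chosen for the example rather than a complete statement of the rules.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (owner name, record type, record data): simplified zone entries.
Record = Tuple[str, str, str]

# Assumption for this sketch: only these DNSSEC types may sit beside a CNAME.
ALLOWED_WITH_CNAME = {"RRSIG", "NSEC"}


def find_cname_conflicts(records: Iterable[Record]) -> Dict[str, Set[str]]:
    """Return, per owner name, the record types that improperly coexist with a CNAME."""
    types_by_name: Dict[str, Set[str]] = defaultdict(set)
    for owner, rtype, _rdata in records:
        # Compare names case-insensitively and ignore a trailing root dot.
        types_by_name[owner.lower().rstrip(".")].add(rtype.upper())

    conflicts: Dict[str, Set[str]] = {}
    for name, types in types_by_name.items():
        if "CNAME" in types:
            extra = types - {"CNAME"} - ALLOWED_WITH_CNAME
            if extra:
                conflicts[name] = extra
    return conflicts


# Hypothetical zone: a CNAME at the apex next to MX and NS is the classic problem case.
zone = [
    ("example.com.", "CNAME", "lb.example.net."),
    ("example.com.", "MX", "10 mail.example.com."),
    ("example.com.", "NS", "ns1.example.com."),
    ("www.example.com.", "CNAME", "lb.example.net."),
    ("www.example.com.", "RRSIG", "(signature data elided)"),
]

if __name__ == "__main__":
    for name, extra in find_cname_conflicts(zone).items():
        print(f"{name}: CNAME coexists with {sorted(extra)}")
```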