DNSSEC disruption affecting .de domains – Resolved

Immediate impact

  • Many users reported most .de domains unreachable (including major sites and personal blogs), while some still worked due to caching.
  • Both DNSSEC-signed and unsigned .de domains failed for users whose resolvers validate DNSSEC; non-validating resolvers or cached entries often still worked.
  • Email and ancillary services (e.g., monitoring, CSS from .de CDNs) were also affected.
  • The outage began in the evening local time; commenters felt this limited the business impact, though it still caused substantial stress for operators.

Technical diagnosis

  • Consensus that this was a DNSSEC failure at DENIC, not a basic nameserver or connectivity outage.
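  • Validating resolvers returned SERVFAIL with extended error data pointing to a malformed RRSIG over an NSEC3 record in the .de zone.
  • Zone contents (A/NS/SOA etc.) appeared intact; the broken signature on NSEC3 under ZSK keytag 33834 invalidated the chain of trust. This also explains why unsigned domains failed: a validator needs a correctly signed NSEC3 proof that no DS record exists before it can treat a delegation as insecure.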
  • Anycast meant some .de nameservers still served older, valid signatures, causing intermittent success.
  • Several posts framed it as a botched ZSK rollover or re-signing during planned maintenance.
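
To make the SERVFAIL diagnosis above concrete, here is a minimal sketch of how a monitoring script might separate DNSSEC validation failures from other SERVFAILs using Extended DNS Error (EDE) info-codes. The code numbers are the ones RFC 8914 registers; the function name `classify_failure` and the returned labels are hypothetical, and the thread does not say which specific EDE code resolvers returned during this incident.

```python
from enum import IntEnum
from typing import Iterable

SERVFAIL = 2  # DNS RCODE for "server failure" (RFC 1035)


class DnssecEde(IntEnum):
    """DNSSEC-related Extended DNS Error info-codes (RFC 8914)."""
    DNSSEC_INDETERMINATE = 5
    DNSSEC_BOGUS = 6
    SIGNATURE_EXPIRED = 7
    SIGNATURE_NOT_YET_VALID = 8
    DNSKEY_MISSING = 9
    RRSIGS_MISSING = 10
    NO_ZONE_KEY_BIT_SET = 11
    NSEC_MISSING = 12


DNSSEC_EDE_CODES = frozenset(int(code) for code in DnssecEde)


def classify_failure(rcode: int, ede_codes: Iterable[int]) -> str:
    """Label a resolver response (hypothetical helper, labels are illustrative).

    rcode      -- the DNS response code from the resolver
    ede_codes  -- EDE info-codes attached to the response, if any
    """
    if rcode != SERVFAIL:
        return "not-servfail"
    if any(code in DNSSEC_EDE_CODES for code in ede_codes):
        return "dnssec-validation-failure"
    # SERVFAIL without a DNSSEC-related EDE: could be upstream timeouts,
    # lame delegations, or a resolver that simply does not send EDE.
    return "servfail-other"
```

A response with RCODE 2 and EDE code 6 would be labeled `dnssec-validation-failure`, while a bare SERVFAIL stays `servfail-other`. Not every resolver attaches EDE data, so the absence of a DNSSEC code does not rule out a validation problem.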

Operational responses & workarounds

  • People temporarily:
    • Switched resolvers (e.g., to those with useful caches).
    • Disabled DNSSEC validation locally (e.g., domain-insecure: "de" in Unbound).
  • Cloudflare temporarily disabled DNSSEC validation for .de on 1.1.1.1 to restore reachability.
  • DENIC's status page was initially unstable (unreachable for some users) but later confirmed a disruption affecting DNSSEC-signed .de domains, followed by its resolution; the root cause was still listed as “under investigation.”

Reliability, DNSSEC, and centralization debates

  • Strong criticism of DNSSEC’s operational brittleness: a single signing error at a TLD effectively removed a major country-code domain from the validating internet.
  • Others argued DNS was always hierarchically centralized at TLDs; DNSSEC mainly adds integrity and actually supports more secure decentralization via validating caches.
  • Several noted DNSSEC’s low real-world adoption and questioned risk/benefit, especially when big sites and banks are often unsigned.
  • Comparisons drawn to TLS and Let’s Encrypt: TLS outages would leak data, whereas DNSSEC failures “fail closed” but hurt availability.

Operational practices & disaster recovery

  • Discussion on:
    • TTL strategies (high normally, lowered before planned changes).
    • Using diverse ASes/providers for authoritative DNS.
    • Robust disaster-recovery and cold-start plans for critical internet infrastructure.
    • Formal methods and more automation/testing for DNSSEC operations, as several commenters suggested.
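
The TTL strategy mentioned above comes down to simple arithmetic, sketched below. It assumes caches honor the published TTL exactly, which real resolvers do not always do (some clamp or extend TTLs). The function names and the example values are hypothetical and not taken from the thread.

```python
from datetime import datetime, timedelta, timezone


def latest_ttl_lowering_time(change_at: datetime, normal_ttl_s: int) -> datetime:
    """Latest moment to publish the lowered TTL before a planned change.

    A cache that fetched the record just before the TTL was lowered may
    keep it for the full normal TTL, so the lowered TTL has to be live at
    least that long before the change.
    """
    return change_at - timedelta(seconds=normal_ttl_s)


def worst_case_stale_window(low_ttl_s: int) -> timedelta:
    """Longest time a well-behaved cache can serve pre-change data,
    provided the TTL was lowered early enough."""
    return timedelta(seconds=low_ttl_s)


if __name__ == "__main__":
    # Illustrative values only: a 1-day normal TTL lowered to 5 minutes.
    change = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
    print("Lower TTL no later than:", latest_ttl_lowering_time(change, 86400))
    print("Worst-case stale window:", worst_case_stale_window(300))
```

The trade-off is that a high TTL in normal operation lets caches ride out a short authoritative outage, while a low TTL around planned changes shortens how long a bad record (or a bad signature) lingers once it is fixed.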

Humor and cultural commentary

  • The thread was filled with German in-jokes (a “Feiertag” (public holiday) for .de, “Danke” (“thanks”) memes), speculation about late-night maintenance after a party, and reflections on past TLD outages.