Ask HN: How did the internet discover my subdomain?

Primary ways subdomains get “discovered”

  • Certificate Transparency (CT) logs publicly record every hostname named in a publicly trusted TLS cert; many tools and services continuously tail these logs.
  • Large-scale IPv4 scanning (e.g., by security companies) hits every routable IP and probes common ports, then fingerprints what’s running.
  • DNS-based techniques: brute-force enumeration with wordlists, AXFR (zone transfers) on misconfigured DNS servers, and DNSSEC/NSEC zone walking.
  • Reverse lookups via TLS: connect to an IP over HTTPS and read the hostnames in the certificate the server presents (subject CN and SAN entries).
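
The wordlist brute-force technique from the DNS bullet can be sketched as below. This is a minimal illustration, not a real enumeration tool: `brute_force` and `system_resolver` are names made up for this example, and the resolver is injectable so the logic can be exercised without touching the network.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def system_resolver(hostname: str) -> Optional[str]:
    """Resolve a hostname with the OS resolver; return None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def brute_force(domain: str, words: Iterable[str],
                resolve: Callable[[str], Optional[str]] = system_resolver) -> Dict[str, str]:
    """Try each word as a label under `domain` and keep the names that resolve."""
    found = {}
    for word in words:
        candidate = f"{word.strip().lower()}.{domain}"
        address = resolve(candidate)
        if address is not None:
            found[candidate] = address
    return found
```

Calling `brute_force("example.com", ["www", "dev", "staging"])` would query the system resolver for each candidate. A zone with a wildcard record answers for every guess, so real tools typically probe a random label first to detect that case.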

DNS, passive data, and commercial services

  • DNS zones are not generally enumerable, but:
    • Some authoritative servers still allow unauthenticated AXFR (misconfiguration but common enough to mine).
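    • DNSSEC can leak zone structure: NSEC records link each name to the next one, so the whole zone can be walked; NSEC3 hashes the names, but the hashes can still be collected and guessed offline.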
    • “Passive DNS” providers and some ISPs/resolvers sell aggregated query/answer logs, revealing which hostnames are being resolved.
    • PTR (reverse DNS) records can map IPs back to hostnames.
  • Many subdomain-finding tools aggregate CT, passive DNS, zone-transfer leaks, brute-forced records, and web crawling into searchable databases.
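
To show why NSEC leaks zone contents, here is a toy sketch of zone walking. Each NSEC record names the next owner name in canonical order, and the last record points back to the zone apex, so following the chain lists every name. The `next_name` callback is a hypothetical stand-in for a real DNSSEC query, and `toy_chain` is invented data.

```python
from typing import Callable, List


def walk_nsec(apex: str, next_name: Callable[[str], str], limit: int = 10000) -> List[str]:
    """Follow the NSEC chain from the zone apex until it wraps back around."""
    names = [apex]
    current = next_name(apex)
    while current != apex and len(names) < limit:
        names.append(current)
        current = next_name(current)
    return names


# Hypothetical zone: each owner name maps to the "next domain name" in its NSEC record.
toy_chain = {
    "example.com": "api.example.com",
    "api.example.com": "secret-staging.example.com",
    "secret-staging.example.com": "www.example.com",
    "www.example.com": "example.com",  # last record points back to the apex
}

print(walk_nsec("example.com", toy_chain.__getitem__))
# ['example.com', 'api.example.com', 'secret-staging.example.com', 'www.example.com']
```

The "unguessable" name falls out of the walk without any guessing, which is the point several commenters made about NSEC.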

IP scanning and default virtual hosts

  • If a scanner connects by raw IP (no SNI/Host header), it often hits the web server’s default vhost; those requests may be logged under a particular subdomain, creating the impression the subdomain itself was targeted.
  • If the server presents a default certificate to clients that send no SNI, the hostnames in that cert can be learned from the IP alone, without knowing the domain first.
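
A rough sketch of how a scanner could read hostnames from a default certificate, using only Python's standard `ssl` module. The function names are made up for this example. It assumes the default certificate chains to a trusted CA: with a self-signed cert the handshake in `default_cert_names` fails verification, and a real scanner would parse the raw DER certificate instead.

```python
import socket
import ssl
from typing import List


def names_from_cert(cert: dict) -> List[str]:
    """Collect DNS names from a dict shaped like ssl.SSLSocket.getpeercert() output."""
    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName" and value not in names:
                names.append(value)
    return names


def default_cert_names(ip: str, port: int = 443, timeout: float = 5.0) -> List[str]:
    """Connect by IP without SNI and return the names in whatever cert is served."""
    context = ssl.create_default_context()
    context.check_hostname = False  # there is no hostname to check against
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # No server_hostname argument, so no SNI extension is sent.
        with context.wrap_socket(sock) as tls:
            return names_from_cert(tls.getpeercert())
```

Serving a certificate with no meaningful names (or refusing the handshake) on the default vhost closes this particular leak.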

Telemetry and browser/endpoint leaks

  • Browser telemetry, corporate firewalls, antivirus, and URL-filtering appliances can observe domains users visit and feed them into security/crawling ecosystems.
  • Links sent through email/webmail (e.g., Gmail) or opened in browsers such as Chrome/Edge may be observed by those services, which can surface otherwise “unlisted” URLs.

Security through obscurity and mitigations

  • Consensus: obscurity (unguessable subdomains) can reduce noise and attack surface but must not be the only control.
  • Suggested mitigations: authentication, IP allowlisting, firewalling the origin so it accepts traffic only from Cloudflare, wildcard certs to reduce CT leakage (only the wildcard name, e.g. *.example.com, is logged, not each subdomain), or hiding sensitive services behind hard-to-guess paths rather than hostnames.
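
As one illustration of the allowlisting idea, the check below accepts a client only if its address falls inside an allowed network. It is a sketch of the logic, not a replacement for a firewall rule; `is_allowed` is a made-up name, and the ranges in `ALLOWED` are documentation placeholders, not Cloudflare's published ranges.

```python
import ipaddress
from typing import Iterable

# Placeholder ranges (reserved for documentation); substitute the real
# ranges of your CDN or office network.
ALLOWED = ["198.51.100.0/24", "2001:db8::/32"]


def is_allowed(client_ip: str, allowed_networks: Iterable[str] = ALLOWED) -> bool:
    """Return True if client_ip belongs to any of the allowed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)
```

In practice the same rule is better enforced at the network edge (firewall or security group), so that requests arriving at the raw IP never reach the web server at all.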