2025-03-06

Ask HN: How did the internet discover my subdomain?

Primary ways subdomains get “discovered”

Certificate Transparency (CT) logs publicly record every hostname named in a certificate issued by a public CA; many tools and services tail these logs continuously.
Large-scale IPv4 scanning (e.g., by security companies) hits every routable IP and probes common ports, then fingerprints what’s running.
DNS-based techniques: brute-force enumeration with wordlists, AXFR (zone transfers) on misconfigured DNS servers, and DNSSEC/NSEC zone walking.
Reverse lookups via TLS: connect to an IP over HTTPS and read the hostnames in the certificate the server presents (subject CN and SAN entries).
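
As a rough illustration of the CT route, the sketch below pulls hostnames out of CT search results. It assumes the results have already been downloaded as a list of JSON objects with a newline-separated "name_value" field (the shape crt.sh's JSON output is commonly seen to use); the function name and the sample data are hypothetical.

```python
def hostnames_from_ct_entries(entries, domain):
    """Collect unique hostnames under `domain` from CT search results.

    Assumption: each entry is a dict whose "name_value" field holds one or
    more newline-separated names from the certificate.
    """
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for entry in entries:
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard entry still reveals its parent name
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Hypothetical sample data, not real log entries.
sample = [
    {"name_value": "staging.example.com\nwww.example.com"},
    {"name_value": "*.internal.example.com"},
    {"name_value": "unrelated.example.org"},
]
print(hostnames_from_ct_entries(sample, "example.com"))
```

Note that a wildcard cert hides the individual labels beneath it, which is why wildcards come up later as a way to reduce CT leakage.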

DNS, passive data, and commercial services

DNS zones are not generally enumerable, but:
- Some authoritative servers still allow unauthenticated AXFR (misconfiguration but common enough to mine).
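- DNSSEC NSEC records link each name to the next one in the zone, so the whole zone can be walked; NSEC3 hashes the names instead, but the hashes can still be guessed offline.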
- “Passive DNS” providers and some ISPs/resolvers sell aggregated query/answer logs, revealing which hostnames are being resolved.
- PTR (reverse DNS) records can map IPs back to hostnames.
Many subdomain-finding tools aggregate CT, passive DNS, zone-transfer leaks, brute-forced records, and web crawling into searchable databases.
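
To make the NSEC point concrete, here is a minimal simulation of zone walking. It assumes a lookup function that returns the "next owner name" from a name's NSEC record; a real walker would get that from signed DNS responses, which is not shown. The zone contents and function names are hypothetical.

```python
def walk_nsec(apex, next_name_of, max_steps=10000):
    """Follow NSEC "next name" pointers from the apex until the chain wraps."""
    names = [apex]
    current = apex
    for _ in range(max_steps):
        nxt = next_name_of(current)
        if nxt == apex:  # the last NSEC points back to the apex: walk complete
            return names
        names.append(nxt)
        current = nxt
    raise RuntimeError("NSEC chain did not close within max_steps")


# Hypothetical zone: each name maps to the next name in its NSEC record.
chain = {
    "example.com": "api.example.com",
    "api.example.com": "secret-admin.example.com",
    "secret-admin.example.com": "www.example.com",
    "www.example.com": "example.com",
}
print(walk_nsec("example.com", chain.__getitem__))
```

With NSEC3 the same walk yields hashes rather than names, so an attacker has to hash candidate labels and compare, which is why short or dictionary-like labels remain exposed.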

IP scanning and default virtual hosts

If a scanner connects by raw IP (no SNI/Host header), it often hits the web server’s default vhost; those requests may be logged under a particular subdomain, creating the impression the subdomain itself was targeted.
When a client sends no SNI, the server typically presents its default certificate, so the hostnames in that cert can be learned without knowing the domain first.
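
A minimal sketch of that lookup using Python's standard ssl module follows. The helper names are hypothetical. One assumption to flag: getpeercert() only returns a populated dict when the certificate chain was verified, so this version leaves chain verification on and disables only the hostname check; a self-signed default cert would make the handshake fail instead.

```python
import socket
import ssl


def names_from_peercert(cert):
    """Extract DNS names from the dict returned by SSLSocket.getpeercert()."""
    names = {value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"}
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.add(value)
    return sorted(names)


def default_cert_names(ip, port=443, timeout=5.0):
    """Connect to a bare IP without SNI and list names in the cert it serves."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # there is no hostname to check against
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # No server_hostname is passed, so no SNI extension is sent.
        with ctx.wrap_socket(sock) as tls:
            return names_from_peercert(tls.getpeercert())


# Hypothetical certificate dict in the shape getpeercert() returns.
sample_cert = {
    "subject": ((("commonName", "dev-7f3a.example.com"),),),
    "subjectAltName": (
        ("DNS", "dev-7f3a.example.com"),
        ("DNS", "grafana.example.com"),
        ("IP Address", "203.0.113.10"),
    ),
}
print(names_from_peercert(sample_cert))
```

Serving a generic or placeholder certificate on the default vhost is one way to keep this channel from revealing real hostnames.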

Telemetry and browser/endpoint leaks

Browser telemetry, corporate firewalls, antivirus, and URL-filtering appliances can observe domains users visit and feed them into security/crawling ecosystems.
Email/webmail (e.g., links in Gmail), Chrome/Edge browsing, and similar channels can surface otherwise “unlisted” URLs.

Security through obscurity and mitigations

Consensus: obscurity (unguessable subdomains) can reduce noise and attack surface but must not be the only control.
Suggested mitigations: authentication, IP allowlisting, firewalling the origin so it accepts traffic only from Cloudflare, wildcard certs to reduce CT leakage, or hiding sensitive services behind hard-to-guess paths rather than hostnames.
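
As one example of the allowlisting idea, the sketch below checks a client address against a set of permitted networks. The ranges are reserved documentation prefixes standing in for whatever networks (an office, a VPN, a CDN's published ranges) would actually be allowed, and the names are hypothetical. In practice this check usually lives in a firewall or reverse proxy rather than in application code.

```python
import ipaddress

# Placeholder ranges (documentation prefixes), not real allowlist values.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr.version == net.version and addr in net for net in networks)


print(is_allowed("203.0.113.7"))   # inside the first placeholder range
print(is_allowed("198.51.100.1"))  # outside every placeholder range
```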
