Joe Tiedeman

Technology and Security

Zone Files: The Only Authoritative Source and Why They’re Still Hard

When people talk about “total domain coverage”, they’re often talking past one another.

Some mean registration.

Some mean DNS resolution.

Some mean recent activity.

These are related, but they are not the same thing.

A zone file is the registry’s authoritative publication. It shows which domains are delegated in DNS under a top-level domain (TLD) at a specific point in time. In other words: which names are currently active in DNS.

Other systems, notably WHOIS and RDAP, can confirm authoritatively whether a specific domain is registered. They can also confirm what state it is in. They are query-based, rate-limited, and intentionally resistant to bulk discovery.

If the question is:

“Which domains are currently delegated under this TLD?”

Zone files remain the only authoritative, enumerable answer.
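To make "enumerable" concrete, here is a minimal sketch of pulling the delegated names out of a zone file. It assumes each record sits on one line in the usual master-file layout (owner, TTL, class, type, data) with fully qualified names; the function name and the sample file name are mine, not part of any registry tooling.

```python
from typing import Iterable, Iterator


def delegated_domains(lines: Iterable[str], tld: str) -> Iterator[str]:
    """Yield each distinct name that has an NS record under `tld`.

    Assumes one record per line: owner, TTL, class, type, rdata,
    with fully qualified owner names (trailing dot).
    """
    suffix = "." + tld.lower().strip(".") + "."
    seen = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and any record that is not an NS record.
        if len(fields) < 5 or fields[3].upper() != "NS":
            continue
        owner = fields[0].lower()
        # Skip the TLD apex itself and anything outside the zone.
        if not owner.endswith(suffix):
            continue
        if owner not in seen:
            seen.add(owner)
            yield owner.rstrip(".")
```

You would feed it a file handle, for example `gzip.open("com.txt.gz", "rt")` (a hypothetical file name). The in-memory `seen` set is fine for small TLDs; for the largest zones you would want something more frugal, such as deduplicating through an external sort.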

Authority doesn’t mean convenience

Zone files are big. Some are very big: the .com zone file, for example, is approximately 25 GB uncompressed.

Downloading major gTLD zone files involves long-lived HTTPS connections transferring hundreds of megabytes or more in a single stream. That pushes up against timeouts, proxies, and intermediate infrastructure that were never designed for sustained bulk transfer.

The result is operational variability:

  • Transfers that fail part-way through
  • Connections reset after several minutes
  • Throttling that isn’t always explicit
  • No native support for resumable downloads

In practice, robust consumers design for retries, idempotency, and partial failure. Needing several attempts to retrieve a large zone file is not unusual; in my experience, it is simply part of normal operation.

This isn’t a criticism of ICANN’s Centralised Zone Data Service (CZDS). It’s simply the reality of moving large authoritative datasets across the public internet, at global scale, from many different registries.
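As a rough illustration of what "multiple attempts" looks like in code, here is a minimal retry wrapper with exponential backoff and jitter. It is a sketch, not how any particular client does it: the function name, the defaults, and the choice to treat `OSError` and `EOFError` as transient are all assumptions you would tune for your own HTTP library.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 2.0,
    max_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on errors assumed to be transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (OSError, EOFError):
            # Assumption: resets, timeouts, and truncated reads are transient.
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff, capped, plus jitter so that parallel
            # workers do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Passing `sleep` in as a parameter is only there so the wrapper can be exercised in tests without actually waiting.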

Designing for failure is not optional

If you’re consuming zone files seriously, for security monitoring, compliance, or research, you learn very quickly that you can’t design your pipeline assuming success.

You design it assuming partial failure, because trust me, it’s going to fail, and possibly often.

That means:

  • Retry logic is a baseline necessity, not a nice-to-have
  • Transient network errors are expected, not exceptional
  • Downloads need to be idempotent, and you need to be able to restart or retry at the workflow level
  • Processing needs to tolerate missing or delayed inputs

Zone files are authoritative, but they are not operationally “friendly”. Anyone claiming otherwise probably isn’t actually pulling them every day.

What zone files are and, more importantly, are not

It’s worth being explicit:

Zone files tell you:

  • Which domains are delegated in DNS
  • Which name servers each of those domains is delegated to
  • What changed between publications, once you compare successive snapshots

They do not tell you:

  • Who owns a domain
  • Whether a domain is newly registered or simply newly delegated
  • Registration lifecycle states (grace periods, pending delete, etc.)

Those answers live elsewhere, but nowhere else provides a complete, enumerable view of DNS delegation.
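The "what changed" part is something you compute yourself by comparing two publications. A minimal sketch, assuming you have already reduced each snapshot to a mapping of domain to its set of name servers (the type and function names here are mine):

```python
from typing import FrozenSet, Mapping, NamedTuple, Set


class DelegationDiff(NamedTuple):
    added: Set[str]    # delegated now, absent from the earlier snapshot
    removed: Set[str]  # delegated earlier, absent now
    changed: Set[str]  # present in both, but with different name servers


def diff_snapshots(
    previous: Mapping[str, FrozenSet[str]],
    current: Mapping[str, FrozenSet[str]],
) -> DelegationDiff:
    """Compare two snapshots that map each domain to its name server set."""
    added = set(current.keys() - previous.keys())
    removed = set(previous.keys() - current.keys())
    changed = {
        domain
        for domain in current.keys() & previous.keys()
        if current[domain] != previous[domain]
    }
    return DelegationDiff(added, removed, changed)
```

Note that `added` only means newly delegated. Whether the domain is also newly registered is exactly the question a zone file cannot answer.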

Cybaa’s WHOIS/RDAP tool gives you structured registration data, first-seen dates, and expiry insight. We even have a handy API which you can integrate into your internal tooling!

The tooling is useful for monitoring brand domains, supplier domains, or spotting suspicious newly-registered infrastructure.

In the next post, we’ll look at where even this model breaks down: country-code TLDs, and why global coverage stops at national borders.
