Note that there are much better DNS replication protocols. It is, of course, up to the administrator to decide on the location of his DNS servers, the software used for those servers, the mechanisms for replicating data among those servers, etc.; there is no requirement that sites replicate their data with AXFR.
AXFR is also sometimes used by unauthorized third parties who want to sneak a peek at a site's data. Many years ago, these peeks were practically always successful, because almost all sites had promiscuous AXFR servers; these days, however, promiscuous AXFR servers are widely discouraged and increasingly uncommon.
(From a snoop's perspective, the difference between AXFR and normal queries is that normal queries force the snoop to guess the relevant domain names, while AXFR reveals the domain names for free. The notion that DNS data is entirely public does not match the reality of private high-entropy domain names at many sites.)
A few DNS parent administrators refuse to delegate names to servers that do not provide data through AXFR to the parent computers. This is bad practice for several reasons, but it nevertheless occurs, and it means that names managed by those administrators are available only to sites that run AXFR servers.
The story is actually more complicated than this, for several reasons:
A TCP-SOA AXFR client, such as named-xfer or axfr-get or dig axfr, actually works as follows. It connects to an AXFR server on TCP port 53. It may then send an SOA request and receive a response. It may then send an AXFR request and receive an AXFR response. It then closes the connection.
A UDP-SOA AXFR client, such as the BIND 9 AXFR client, works as follows. It may send an SOA request to a DNS server on UDP port 53 and receive a response. It may then connect to an AXFR server on TCP port 53 at the same IP address, send an AXFR request, receive an AXFR response, and close the connection.
An AXFR server is prepared for both types of AXFR clients, and for non-AXFR DNS clients requesting data that did not fit through UDP. After the server accepts a connection, it
In theory, the AXFR server's procedure allows a client to send any number of AXFR requests and other requests in a single connection. In practice, current clients close the connection after the first request, except in the case of an SOA request followed by an AXFR request from a TCP-SOA AXFR client. Separate requests are handled in separate connections.
These tests are the source of the rule ``The SOA serial number must increase whenever any other data changes.'' If the zone has changed, but the serial number in the SOA response is not increased (by an amount between 1 and 2^31-1 inclusive, modulo 2^32), then AXFR clients may never see the new zone data.
One of the flaws in the AXFR protocol is that it's actually impossible for servers to follow this rule under all circumstances. AXFR clients will sometimes fail to pick up changes in a zone.
For example, suppose a BIND 9 AXFR client receives a zone through AXFR, and then checks for changes later. Suppose that there have been between 2^31 and 2^32-1 changes in the meantime. Suppose that the AXFR server uses the strategy of increasing the serial number by exactly 1 for each change. Result: The AXFR client will skip the AXFR request. The AXFR client won't receive the new data. Similar comments apply to other serial-number strategies.
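The serial-number test and its failure case can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated above; the function name `serial_increased` is hypothetical and is not taken from any AXFR client.

```python
MOD = 2**32   # serial numbers are 32-bit values
HALF = 2**31

def serial_increased(old: int, new: int) -> bool:
    """True if new is ahead of old by between 1 and 2^31-1, modulo 2^32."""
    diff = (new - old) % MOD
    return 1 <= diff <= HALF - 1

# One change, serial increased by 1: the client sees an increase.
print(serial_increased(2001010101, 2001010102))   # True

# Wraparound from 2^32-1 to 0 still counts as an increase of 1.
print(serial_increased(MOD - 1, 0))               # True

# 2^31 changes, each increasing the serial by exactly 1:
# the client concludes that nothing is new and skips the AXFR request.
old = 5
print(serial_increased(old, (old + HALF) % MOD))  # False
```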
Sites that do not rely on AXFR can ignore serial numbers in favor of mechanisms that actually ensure accurate replication, such as the cryptographically strong checksums in rsync.
AXFR clients are prepared for incomplete AXFR responses, and throw them away if they occur. For example, if the b.ns.yale.edu DNS server receives only part of the yale.edu data from the a.ns.yale.edu DNS server, it throws that data away, and tries again later.
Similarly, AXFR servers are prepared for sudden connection failures.
AXFR clients and AXFR servers, like other network programs, impose time limits on each network operation. For example, axfr-get aborts its AXFR attempt if it does not receive any network data for 60 seconds.
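Here is a minimal Python sketch of how a client might combine these two precautions: a time limit on every read, and an error that tells the caller to throw away a partial response. It is not the axfr-get code. The names `read_exact`, `read_message`, and `IncompleteTransfer` are hypothetical. The 2-byte length prefix is the standard framing for DNS messages over TCP.

```python
import socket
import struct

TIMEOUT_SECONDS = 60  # the axfr-get figure quoted above

class IncompleteTransfer(Exception):
    """The connection ended before a whole message arrived."""

def read_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes first."""
    data = b""
    while len(data) < n:
        # recv raises socket.timeout if no data arrives within the limit.
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise IncompleteTransfer("connection closed early")
        data += chunk
    return data

def read_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed DNS message from a TCP connection."""
    sock.settimeout(TIMEOUT_SECONDS)
    (length,) = struct.unpack(">H", read_exact(sock, 2))
    return read_exact(sock, length)
```

A caller that catches `IncompleteTransfer` or `socket.timeout` would discard whatever it has collected so far and try the whole transfer again later.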
Many servers simply extract the query ID and zone name, stop parsing bytes after the \000\374\000\001, and ignore the possibility of extensions. It is the responsibility of extended AXFR clients to preserve compatibility with unextended AXFR servers.
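For reference, here is a Python sketch of the bytes that an unextended AXFR client sends. The helper names `encode_name` and `axfr_request` are hypothetical. The layout is the standard DNS query format: a 12-byte header, then the zone name, then query type 252 (AXFR) and class 1 (IN), which are the \000\374\000\001 bytes mentioned above. The sketch does not enforce the 63-byte label limit.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def axfr_request(zone: str, query_id: int) -> bytes:
    """Build an unextended AXFR request, framed for TCP."""
    # Header: ID, flags (zero: standard query), QDCOUNT=1, other counts zero.
    header = struct.pack(">HHHHHH", query_id, 0, 1, 0, 0, 0)
    # Question: zone name, QTYPE 252 (AXFR), QCLASS 1 (IN).
    question = encode_name(zone) + struct.pack(">HH", 252, 1)
    message = header + question
    # DNS over TCP puts a 2-byte length in front of each message.
    return struct.pack(">H", len(message)) + message

request = axfr_request("yale.edu", 0x1234)
print(request.endswith(b"\000\374\000\001"))  # True
```

A server that stops parsing after those four bytes never looks at anything an extended client might append, which is why the compatibility burden falls on the client.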
An AXFR rejection can be distinguished from zone data in several ways. For example:
Server implementations vary in their preferred methods of indicating errors. Here are some issues for implementors to consider:
Sometimes a server will start sending zone data successfully, but then encounter an error (such as a disk failure) before completing the AXFR response. In this case, closing the connection is by far the safest strategy.
An AXFR server sends a zone by sending a series of DNS packets containing all the records in the zone, in the format shown below. Note that an AXFR response can, and almost always does, include more than one DNS packet, unlike a normal DNS response. AXFR clients parse the DNS packets to determine the end of the response.
It is the responsibility of the AXFR server to ensure that a zone is retrieved atomically. If the zone changes during an AXFR response, the AXFR server must finish sending the original zone (concluding with the original SOA with the original serial number), or abort the transfer (which may mean that the zone is never successfully transferred, if changes are frequent).
Here is the format for each DNS packet:
axfrdns never includes the question. BIND 9 includes the question in the first packet but not in subsequent packets. The BIND company's ``AXFR clarifications'' tell implementors to use the BIND 9 strategy, but this has no benefits; it is certainly not necessary for interoperability.
axfr-get reads records through the end of the packet. BIND 9 reads only the records in the answer section. Both of these parsing mechanisms work properly, because the authority section and the additional section are empty. The BIND company's ``AXFR clarifications'' demand that implementors use the BIND 9 parsing mechanism, but this has no benefits; it is certainly not necessary for interoperability.
WARNING: A huge number of AXFR client installations do not support more than one answer record per packet. Consequently, AXFR servers must send one answer record per packet. (On the bright side, this is also the easiest strategy to implement.)
The BIND company, demonstrating an astonishing disregard for interoperability, changed BIND to start sending multiple answer records per packet by default, even though they were perfectly aware that this breaks many deployed clients. Consequently, AXFR clients must accept multiple answer records per packet.
(The BIND company's excuse, namely bandwidth, displays an equally astonishing lack of perspective. AXFR traffic is a tiny part of DNS traffic, and DNS traffic is a tiny part of total web traffic. In the unlikely event that there's any site that actually cares about AXFR bandwidth: You should be using gzipped rsync, not this primitive AXFR protocol.)
     @                  1D IN SOA A.ROOT-SERVERS.NET NSTLD.VERISIGN-GRS.COM ...
                        6D IN NS A.ROOT-SERVERS.NET
     A.ROOT-SERVERS.NET 5w6d16h IN A 198.41.0.4
     ...
     mil                1D IN NS A.ROOT-SERVERS.NET
     A.ROOT-SERVERS.NET 5w6d16h IN A 198.41.0.4
     ...
     @                  1D IN SOA A.ROOT-SERVERS.NET NSTLD.VERISIGN-GRS.COM ...
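The listing above opens and closes with the same SOA record, and a client can use the closing SOA to recognize the end of the response. Here is a minimal Python sketch of that check, assuming the records have already been parsed out of the DNS packets into (name, type, data) tuples. The function name `collect_zone` and the tuple layout are hypothetical.

```python
def collect_zone(records):
    """Return the records up to and including the closing SOA, or None."""
    zone = []
    for name, rtype, data in records:
        if not zone and rtype != "SOA":
            return None   # a zone transfer opens with the SOA record
        zone.append((name, rtype, data))
        if rtype == "SOA" and len(zone) > 1:
            return zone   # second SOA: the response is complete
    return None           # stream ended early: discard everything
```

Returning `None` for a stream that ends before the second SOA corresponds to throwing away an incomplete AXFR response.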
AXFR clients remove all repeated records in zones received through AXFR. (Note to Bert: Sorting takes essentially linear time, not quadratic time.)
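Removing repeats by sorting can be sketched as follows in Python. The function name `remove_repeats` is hypothetical, and the sketch assumes that records are (name, type, data) tuples already normalized so that equal records compare equal (for example, names in lowercase).

```python
def remove_repeats(records):
    """Return the records in sorted order with exact duplicates dropped."""
    unique = []
    for record in sorted(records):
        # After sorting, any repeats are adjacent to each other.
        if not unique or unique[-1] != record:
            unique.append(record)
    return unique

records = [
    ("a.root-servers.net", "A", "198.41.0.4"),
    ("mil", "NS", "a.root-servers.net"),
    ("a.root-servers.net", "A", "198.41.0.4"),
]
print(len(remove_repeats(records)))  # 2
```

The sort costs O(n log n) comparisons and the single pass afterwards is linear, so no record is ever compared against every other record.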
The BIND company's ``AXFR clarifications'' tell AXFR servers to avoid repeating any records (other than the SOA). However, this is not necessary for interoperability. The BIND company's bandwidth excuse is frivolous, as discussed above.
Suppose the ISP pulls the princeton.edu zone from a Princeton AXFR server, and receives the data
     haven.princeton.edu NS serv1.net.yale.edu
     serv1.net.yale.edu A 128.112.128.15
from the AXFR server. The point of the following analysis is that the ISP must discard the yale.edu information.
An innocent user's cache has the legitimate record
     yale.edu NS serv1.net.yale.edu
saved. The user asks for the address of www.haven.princeton.edu. The cache contacts the root server, learns that the ISP is a .edu server, contacts the ISP, and receives the same information
     haven.princeton.edu NS serv1.net.yale.edu
     serv1.net.yale.edu A 128.112.128.15
that the ISP obtained from the Princeton server. Because the ISP is a .edu server, the cache trusts and saves the serv1.net.yale.edu information. The user now asks for the address of www.yale.edu. The cache knows yale.edu NS serv1.net.yale.edu and serv1.net.yale.edu A 128.112.128.15, so it contacts 128.112.128.15 to obtain the address of www.yale.edu. In short, Princeton has been given control over the Yale web server.
As a general rule, before any data can be made available from a zone retrieved from a third party through AXFR, records that don't end with the zone name have to be discarded. There is no harm in AXFR clients discarding all such records upon receipt.
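Here is a minimal Python sketch of the discard step, again assuming (name, type, data) tuples. The function names are hypothetical. Note that "ends with the zone name" has to be checked label by label: notprinceton.edu ends with the characters princeton.edu but is not inside that zone.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]

def in_zone(name, zone):
    """True if name equals zone or is a descendant of it."""
    n, z = labels(name), labels(zone)
    return len(n) >= len(z) and n[len(n) - len(z):] == z

def discard_out_of_zone(records, zone):
    """Keep only the records whose owner name ends with the zone name."""
    return [record for record in records if in_zone(record[0], zone)]

received = [
    ("haven.princeton.edu", "NS", "serv1.net.yale.edu"),
    ("serv1.net.yale.edu", "A", "128.112.128.15"),
]
for record in discard_out_of_zone(received, "princeton.edu"):
    print(record)
# ('haven.princeton.edu', 'NS', 'serv1.net.yale.edu')
```

The poisoned serv1.net.yale.edu address is dropped on receipt, so it can never be served to a cache.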
When I first mentioned this type of attack, the BIND 9 implementors claimed that it was safe for the ISP to provide the serv1.net.yale.edu information as glue for haven.princeton.edu. That claim is false, as the above analysis shows. Do some versions of BIND 9 fail to discard the poison? Is this yet another BIND vulnerability?
RFC 1034 and RFC 1035 specify the semantics of the Domain Name System: the DNS database is a collection of trees, containing nodes, containing record sets of various types. There is one tree for each class (IN, CH, etc.); at most one node in each tree for each name (aol.com, etc.); and at most one record set in each node for each type (A, MX, etc.).
In other words, the DNS database at any moment is a collection of record sets, indexed by class, name, and type. For example, there is one record set for class IN, name aol.com, type A. (That record set contains four IP addresses right now: 64.12.187.24, 205.188.145.213, 205.188.160.120, and 64.12.149.25.)
Of course, this semantic rule does not mean that copies of data around the Internet are magically equalized if they have the same class, name, and type. Most of the copying protocols have reliability problems, producing accidental (though usually harmless) inconsistencies. Often people deliberately introduce inconsistencies---for example, giving different answers to different clients. (This is called ``client differentiation'' in tinydns, and ``views'' in BIND 9.)
What the semantic rule means is that implementations are allowed to store record sets by class, name, and type. If an implementation is faced with two record sets of the same class, name, and type, it is allowed to throw one record set away. Three examples:
To summarize: The fact that one DNS implementation indexes its database by something more than class+name+type does not oblige other DNS implementations to do the same thing.
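As an illustration, here is a Python sketch of a database that indexes record sets by class, name, and type and nothing else. The function name `store` is hypothetical, and keeping the most recently stored record set is an arbitrary choice made for this sketch; the semantic rule permits discarding either one.

```python
def store(db, rclass, name, rtype, values):
    """Store a record set, replacing any set with the same class, name, type."""
    db[(rclass, name.lower().rstrip("."), rtype)] = frozenset(values)

db = {}
store(db, "IN", "cs.princeton.edu", "NS",
      ["cs.princeton.edu", "engram.cs.princeton.edu"])
store(db, "IN", "cs.princeton.edu", "NS",
      ["dns1.cs.princeton.edu", "dns2.cs.princeton.edu"])

print(len(db))  # 1
print(sorted(db[("IN", "cs.princeton.edu", "NS")]))
# ['dns1.cs.princeton.edu', 'dns2.cs.princeton.edu']
```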
As discussed above, the ISP is free to assume that the princeton.edu zone contains the same cs.princeton.edu data as the cs.princeton.edu zone. If the princeton.edu AXFR server says
     cs.princeton.edu NS cs.princeton.edu
     cs.princeton.edu NS engram.cs.princeton.edu
while the cs.princeton.edu AXFR server says
     cs.princeton.edu NS dns1.cs.princeton.edu
     cs.princeton.edu NS dns2.cs.princeton.edu
then the ISP is free to discard the first record set in favor of the second. If the Princeton administrators don't like this for some reason, too bad; it's their fault for not following the DNS semantics specified in RFC 1034.
As a general rule, when AXFR results for w.x.y.z and y.z are combined, any record sets in the y.z zone whose names end with w.x.y.z can be discarded.
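Here is a Python sketch of that combining rule, with records as (name, type, data) tuples. The function names are hypothetical, and the www.princeton.edu record with address 192.0.2.1 is invented purely for illustration.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]

def in_zone(name, zone):
    """True if name equals zone or is a descendant of it."""
    n, z = labels(name), labels(zone)
    return len(n) >= len(z) and n[len(n) - len(z):] == z

def combine(parent_records, child_zone, child_records):
    """Merge two AXFR results, preferring the child zone's own data."""
    kept = [r for r in parent_records if not in_zone(r[0], child_zone)]
    return kept + list(child_records)

parent = [
    ("www.princeton.edu", "A", "192.0.2.1"),  # hypothetical record
    ("cs.princeton.edu", "NS", "cs.princeton.edu"),
    ("cs.princeton.edu", "NS", "engram.cs.princeton.edu"),
]
child = [
    ("cs.princeton.edu", "NS", "dns1.cs.princeton.edu"),
    ("cs.princeton.edu", "NS", "dns2.cs.princeton.edu"),
]
for record in combine(parent, "cs.princeton.edu", child):
    print(record)
```

The result contains the hypothetical www.princeton.edu record followed by the two NS records from the cs.princeton.edu zone; the parent's copy of the cs.princeton.edu record set is gone.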
BIND 9, unlike most DNS server installations on the Internet, keeps track of both record sets. Of course, it is forced to throw one record set away when it responds to queries; a query asks for a record set by class+name+type.
In yet another demonstration of their astonishing disregard for interoperability, the BIND company is pushing an optional ``IXFR'' protocol that assumes the BIND 9 behavior. They say that IXFR breaks horribly when it encounters the standard RFC 1034 behavior. Normal people conclude that IXFR is broken and that IXFR has to be fixed. (I've explained elsewhere how to fix it.) But the BIND company is instead demanding in its ``AXFR clarifications'' that most AXFR clients on the Internet be changed to accommodate IXFR. This is insane.
BIND 8.2.3 applies this rule not only to zones added manually by the system administrator, but also to data received through AXFR, so it creates interoperability problems.
There are many ways to see that the BIND 8.2.3 rule is flawed. Here's the easiest: What happens if the IETF adds another address type beyond A/AAAA/A6? Answer: a zone administrator who adds a record of that type causes a complete zone-transfer failure with BIND 8.2.3. This is even worse than BIND's other problems with new types, because it kills the whole zone transfer, not just the new records.
As a general rule, AXFR clients cannot assume that they will be able to figure out why a record was included in a zone.
Zone creators probably don't need to worry about the BIND 8.2.3 bug. Subsequent versions of BIND (including BIND 9) have reportedly been fixed; and all BIND 8.2.3 users are required to upgrade, because BIND 8.2.3 also has a remotely exploitable buffer overflow allowing attackers to take over the machine.