DNS Standards to Support Wide-Area Bonjour
Over the past few weeks I've been trying to move my DNS services to third-party DNS providers, like DreamHost, FreeDNS, and DynDNS. So far, I haven't found a provider that accepts arbitrary data (including spaces) in resource record values, which severely limits how nice wide-area Bonjour can be. I've been able to advertise services, but they look bad.
For more information, consult these pages on Bonjour and Wide-Area Bonjour.
Here I've tried to collect relevant parts of DNS-related RFCs to provide evidence that DNS service providers should support arbitrary characters in their DNS records.
Summary: there’s no reason a DNS provider should restrict the content of any resource record name or data field, except for its length. Those providers who use HTML forms for configuration can also make it easy to use UTF-8 text by just accepting what’s entered into each form field, and handle conversion to BIND configurations themselves (if that’s their implementation).
RFC1035 Domain Names—Implementation and Specification
From 3.3. “Standard RRs,” page 12:
<domain-name> is a domain name represented as a series of labels, and terminated by a label with zero length. <character-string> is a single length octet followed by that number of characters. <character-string> is treated as binary information, and can be up to 256 characters in length (including the length octet).
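To make the <character-string> layout concrete, here is a minimal Python sketch of the wire encoding described above: one length octet followed by up to 255 octets of data. The function names are my own, not taken from any DNS library, and the sample value is made up.

```python
from typing import Tuple


def encode_character_string(data: bytes) -> bytes:
    """Encode data as a <character-string>: one length octet, then the data."""
    if len(data) > 255:
        # 256 octets total including the length octet, so 255 octets of data.
        raise ValueError("character-string data is limited to 255 octets")
    return bytes([len(data)]) + data


def decode_character_string(wire: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return (data, next_offset) for the <character-string> at offset."""
    length = wire[offset]
    end = offset + 1 + length
    if end > len(wire):
        raise ValueError("truncated character-string")
    return wire[offset + 1:end], end


# The data is treated as binary, so spaces and UTF-8 need no special handling.
encoded = encode_character_string("note=Caf\u00e9 printer".encode("utf-8"))
decoded, next_offset = decode_character_string(encoded)
```

Nothing in this encoding cares what the octets are, which is the point: the only limit is the length.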
3.3.12. “PTR RDATA format,” page 17:
PTRDNAME
where:
PTRDNAME | A <domain-name> which points to some location in the domain name space. |
PTR records cause no additional section processing. These RRs are used in special domains to point to some other location in the domain space. These records are simple data, and don't imply any special processing similar to that performed by CNAME, which identifies aliases. See the description of the IN-ADDR.ARPA domain for an example.
From 5.1. “Format,” page 33:
<domain-name>s make up a large share of the data in the master file. The labels in the domain name are expressed as character strings and separated by dots. Quoting conventions allow arbitrary characters to be stored in domain names. Domain names that end in a dot are called absolute, and are taken as complete. Domain names which do not end in a dot are called relative; the actual domain name is the concatenation of the relative part with an origin specified in a $ORIGIN, $INCLUDE, or as an argument to the master file loading routine. A relative name is an error when no origin is available.
…
<character-string> is expressed in one or two ways: as a contiguous set of characters without interior spaces, or as a string beginning with a " and ending with a ". Inside a " delimited string any character can occur, except for a " itself, which must be quoted using \ (back slash).
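As a rough illustration of the quoted form, the Python sketch below turns arbitrary bytes into a "-delimited master-file string. The function name is hypothetical. Escaping the backslash itself and writing non-printable or non-ASCII octets as \DDD are conservative choices of mine based on the \X and \DDD conventions in the same RFC section, not something a particular provider is known to do.

```python
def quote_character_string(data: bytes) -> str:
    """Render data as a double-quoted master-file <character-string>."""
    out = []
    for octet in data:
        ch = chr(octet)
        if ch in '"\\':
            # A quote (and, conservatively, a backslash) is quoted with a backslash.
            out.append("\\" + ch)
        elif 0x20 <= octet <= 0x7E:
            # Printable ASCII, including spaces, is allowed as-is inside quotes.
            out.append(ch)
        else:
            # Anything else is written as a three-digit decimal escape.
            out.append("\\%03d" % octet)
    return '"' + "".join(out) + '"'


# Hypothetical TXT record value containing spaces and quotes.
quoted = quote_character_string(b'note=say "hi" to everyone')
```

A provider's web form could apply something like this to whatever the user types, so the user never has to think about zone-file quoting.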
…
Because these files are text files several special encodings are necessary to allow arbitrary data to be loaded. In particular:
of the root. | |
@ | A free standing @ is used to denote the current origin. |
\X | where X is any character other than a digit (0-9), is used to quote that character so that its special meaning does not apply. For example, "\." can be used to place a dot character in a label. |
\DDD | where each D is a digit is the octet corresponding to the decimal number described by DDD. The resulting octet is assumed to be text and is not checked for special meaning. |
( ) | Parentheses are used to group data that crosses a line boundary. In effect, line terminations are not recognized within parentheses. |
; | Semicolon is used to start a comment; the remainder of the line is ignored. |
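The \X and \DDD encodings above are enough to write any label in a master file. Here is a minimal Python sketch of that conversion in both directions. The function names and the exact set of characters I escape with \X are my own conservative assumptions, and "My Web Server" is a made-up Bonjour service instance name.

```python
# Characters given a backslash escape; a conservative, assumed set.
SPECIAL = set(b'.\\"();@$')
DIGITS = "0123456789"


def escape_label(label: bytes) -> str:
    """Render one raw label in master-file text form."""
    out = []
    for octet in label:
        if octet in SPECIAL:
            out.append("\\" + chr(octet))      # \X form
        elif 0x21 <= octet <= 0x7E:
            out.append(chr(octet))             # ordinary printable ASCII
        else:
            out.append("\\%03d" % octet)       # \DDD: space, control, non-ASCII
    return "".join(out)


def unescape_label(text: str) -> bytes:
    """Reverse escape_label: turn master-file text back into raw octets."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        ddd = text[i + 1:i + 4]
        if len(ddd) == 3 and all(c in DIGITS for c in ddd):
            value = int(ddd)
            if value > 255:
                raise ValueError("\\DDD escape out of range")
            out.append(value)
            i += 4
        elif i + 1 < len(text) and text[i + 1] not in DIGITS:
            out += text[i + 1].encode("utf-8")
            i += 2
        else:
            raise ValueError("malformed escape")
    return bytes(out)


escaped = escape_label(b"My Web Server")
```

With this, the instance name "My Web Server" comes out as My\032Web\032Server, which is the form a PTR record's target would take in a zone file. A provider that accepts plain text in a form field could do this conversion itself.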
RFC2181 “Clarifications to the DNS Specification”
Section 11. “Name Syntax”:
Occasionally it is assumed that the Domain Name System serves only the purpose of mapping Internet host names to data, and mapping Internet addresses to host names. This is not correct, the DNS is a general (if somewhat limited) hierarchical database, and can store almost any kind of data, for almost any purpose.
The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators). The zero length full name is defined as representing the root of the DNS tree, and is typically written and displayed as ".". Those restrictions aside, any binary string whatever can be used as the label of any resource record. Similarly, any binary string can serve as the value of any record that includes a domain name as some or all of its value (SOA, NS, MX, PTR, CNAME, and any others that may be added). Implementations of the DNS protocols must not place any restrictions on the labels that can be used. In particular, DNS servers must not refuse to serve a zone because it contains labels that might not be acceptable to some DNS client programs. A DNS server may be configurable to issue warnings when loading, or even to refuse to load, a primary zone containing labels that might be considered questionable, however this should not happen by default.
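The length rules in this paragraph are the only checks a name needs. As a sketch, here is how a domain name could be encoded to wire format in Python while enforcing just those limits. The function name and the example name are hypothetical.

```python
from typing import List


def encode_name(labels: List[bytes]) -> bytes:
    """Encode labels as a wire-format domain name, checking only lengths."""
    wire = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("each label must be 1 to 63 octets long")
        wire.append(len(label))   # length octet
        wire += label             # label contents, any binary string
    wire.append(0)                # zero-length label for the root
    if len(wire) > 255:
        raise ValueError("full domain name is limited to 255 octets")
    return bytes(wire)


# A made-up service instance name with spaces in its first label.
name = encode_name([b"My Web Server", b"_http", b"_tcp", b"example", b"com"])
```

Note that the function never inspects the contents of a label. Spaces, dots, and non-ASCII octets all pass, which matches the RFC's statement that any binary string can be used.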
Note however, that the various applications that make use of DNS data can have restrictions imposed on what particular values are acceptable in their environment. For example, that any binary label can have an MX record does not imply that any binary name can be used as the host part of an e-mail address. Clients of the DNS can impose whatever restrictions are appropriate to their circumstances on the values they use as keys for DNS lookup requests, and on the values returned by the DNS. If the client has such restrictions, it is solely responsible for validating the data from the DNS to ensure that it conforms before it makes any use of that data.
See also [RFC1123] section 6.1.3.5.
RFC1123 section 6.1.3.5:
6.1.3.5 Extensibility
DNS software MUST support all well-known, class-independent formats [DNS:2], and SHOULD be written to minimize the trauma associated with the introduction of new well-known types and local experimentation with non-standard types.
DISCUSSION:
The data types and classes used by the DNS are extensible, and thus new types will be added and old types deleted or redefined. Introduction of new data types ought to be dependent only upon the rules for compression of domain names inside DNS messages, and the translation between printable (i.e., master file) and internal formats for Resource Records (RRs).
Compression relies on knowledge of the format of data inside a particular RR. Hence compression must only be used for the contents of well-known, class-independent RRs, and must never be used for class-specific RRs or RR types that are not well-known. The owner name of an RR is always eligible for compression.
A name server may acquire, via zone transfer, RRs that the server doesn't know how to convert to printable format. A resolver can receive similar information as the result of queries. For proper operation, this data must be preserved, and hence the implication is that DNS software cannot use textual formats for internal storage.
The DNS defines domain name syntax very generally—a string of labels each containing up to 63 8-bit octets, separated by dots, and with a maximum total of 255 octets. Particular applications of the DNS are permitted to further constrain the syntax of the domain names they use, although the DNS deployment has led to some applications allowing more general names. In particular, Section 2.1 of this document liberalizes slightly the syntax of a legal Internet host name that was defined in RFC-952 [DNS:4].