--On Friday, 05 April, 2002 22:53 +0700 Robert Elz
<kre@munnari.OZ.AU> wrote:

>     Date:        Thu, 4 Apr 2002 09:50:01 -0800 (PST)
>     From:        "Gary E. Miller" <gem@rellim.com>
>     Message-ID:  <Pine.LNX.4.44.0204040931110.10828-100000@catbert.rellim.com>
>
>   | Maybe it can, but that does not make it right.
>   |
>   | RFC 1035 "DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION"
>   |
>   | 2.3.1
>
> If you actually go read that section, carefully, instead of
> just quoting the part from it that everyone notices first, you
> will see that it says something quite different from what you
> think it does.
>
> You need to read the part of the section that appears on the
> preceding page of the formatted RFC...
>
> Or see (part of) rfc2181 for a longer version of this.

Actually, having read that section, and several other sections,
_very_ carefully in recent months, I think 2181 is contradictory
at best, and possibly seriously wrong, on this point.

As I read them, what 1034 and 1035 say is that the DNS can
accommodate any octets, but that the [at least then]
currently-specified RRs are restricted to ASCII. The LDH rule is
a good ("best"?) practices one. It is the LDH rule that RFC 1123
modified slightly. And it is quite correct to assert that the LDH
rule is not a _DNS_ requirement. But the ASCII rule is a firm
requirement.

For evidence of this, temporarily ignore the text (although,
personally, I think it is clear -- especially in 2.3.3 -- if read
carefully) and examine the requirement that, for the defined RRs,
labels and queries be compared in a case-insensitive way. For
ASCII, that is a well-defined operation, one that can be performed
by doing the comparison under a bit mask. For other scripts, as
the IDN WG discovered, "case insensitive comparison" is typically
not completely well-defined, often involves complex tables and/or
knowledge of local context, and is sometimes quite controversial
as to what is intended.
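To make "comparison under a bit mask" concrete, here is a minimal sketch in Python. It is an illustration only, not taken from any RFC or from any particular resolver; the names `CASE_BIT`, `MASK`, and `mask_equal` are invented for this example. It assumes the simplest possible implementation: clear bit 0x20 in every octet of both labels and compare what is left.

```python
# Hypothetical illustration: names and structure are assumptions,
# not code from any DNS implementation.

CASE_BIT = 0x20         # the single bit separating 'A' (0x41) from 'a' (0x61)
MASK = 0xFF ^ CASE_BIT  # 0xDF: clears the case bit in every octet


def mask_equal(a: bytes, b: bytes) -> bool:
    """Compare two labels octet by octet with the case bit cleared."""
    return len(a) == len(b) and all(
        (x & MASK) == (y & MASK) for x, y in zip(a, b)
    )


if __name__ == "__main__":
    # ASCII letters: the mask gives the expected case-insensitive match.
    print(mask_equal(b"EXAMPLE", b"example"))                     # True

    # UTF-8 octets: y-diaeresis (C3 BF) and sharp s (C3 9F) are
    # unrelated characters, yet they differ only in bit 0x20.
    print(mask_equal("ÿ".encode("utf-8"), "ß".encode("utf-8")))   # True

    # Cyrillic small and capital "ya" (D1 8F vs D0 AF) are a genuine
    # case pair, yet their lead octets differ outside the masked bit.
    print(mask_equal("я".encode("utf-8"), "Я".encode("utf-8")))   # False
```

For ASCII letters the result is what the specification requires. For octets above 0x7F the same operation still runs, but its answers have no connection to any script's idea of case: it can match characters that are unrelated and fail to match characters that are a true upper/lower pair.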
So I believe that the "future RRs" language with regard to binary
labels in 1034 and 1035 must be taken seriously and as normative
text: if new RRs (or new classes) are defined, they can be defined
as binary and, hence, as not requiring case-insensitive
comparisons. Conversely, within the current set (or at least the
historical set at the time of 1034/1035), case-insensitive
comparison is required and hence binary must not be permitted.
Any other reading, I believe, leads immediately either to
contradictions or to undefined states within the protocol.

As an aside, it appears to me that this requirement for
case-insensitive comparison is the real problem with "just put
UTF-8 in the DNS" approaches. An existing and conforming
implementation has no way to do those required case-insensitive
comparisons outside the ASCII range. Worse, if it does those
comparisons by bit-masking (which would be conforming today),
there is a risk of its getting rather bizarre errors (of either
matching or not matching) on characters outside the ASCII range.
One supposes that we could modify the protocol to specify that
case-insensitive comparisons be made only for octets in the ASCII
range, but, unless that were done through an EDNS option, it would
be a potentially fairly significant retroactive change.

     john