>> .....Due to the ASCII character encoding being the core/monopoly

From draft-ietf-idnabis-protocol-03.txt Section 6.1:

   The current update to the definition of the DNS protocol [RFC2181]
   explicitly allows domain labels to contain octets beyond the ASCII
   range (0000..007F), and this document does not change that. Note,
   however, that there is no defined interpretation of octets
   0080..00FF as characters. If labels containing these octets are
   returned to applications, unpredictable behavior could result. The
   A-label form, which cannot contain those characters, is the only
   standard representation for internationalized labels in the current
   DNS protocol.

As noted above, the DNS protocol does not prohibit the carrying of
non-ASCII characters; the issue is the response of applications to
receipt of such characters in responses. Presumably applications
written to UNICODE APIs such as GetAddrInfoW are capable of handling
UTF-8 in responses, and indeed there are many such applications (e.g.
applications depending on .NET/mono DNS classes).

> > presently you cannot have domain names that are multilingual, for
> > example: japanese and english language mixed character domain names,
> > hindi and english language mixed character domain names etc.
>
> Since it is an IETF mailing list, I will focus on what depends on
> IETF, technical standards. There is *nothing* in the current IDN
> standard (machine names in Unicode) that forbids such mixes. You may
> refer to bad policies like ICANN IDN Guidelines, which apparently
> forbid mixing scripts, but this had nothing to do with the IETF,
> nothing to do with the protocols.

From draft-ietf-idnabis-rationale-01.txt Section 14:

   To help prevent confusion between characters that are visually
   similar, it is suggested that implementations provide visual
   indications where a domain name contains multiple scripts. Such
   mechanisms can also be used to show when a name contains a mixture
   of simplified and traditional Chinese characters, or to distinguish
   zero and one from O and l. DNS zone administrators may impose
   restrictions (subject to the limitations identified elsewhere in
   this document) that try to minimize characters that have similar
   appearance or similar interpretations. It is worth noting that
   there are no comprehensive technical solutions to the problems of
   confusable characters. One can reduce the extent of the problems
   in various ways, but probably never eliminate it. Some specific
   suggestions about identification and handling of confusable
   characters appear in a Unicode Consortium publication
   [Unicode-UTR36].

This is *not* a prohibition, but rather a suggestion; Section 4 of the
document contains no restriction on the registration of labels with
mixed scripts. Similar advice can be found in RFC 3490 Section 10.

> > Another example, there is not much browser / URL bar integration and
> > usability innovation that allow for a non-ASCII language domain name
> > to stay non-ASCII script on the browser / URL bar without it
> > changing to Punycode.

From draft-ietf-idnabis-rationale-01.txt Section 7.2:

   Applications MAY allow the display and user input of A-labels, but
   are encouraged to not do so except as an interface for special
   purposes, possibly for debugging, or to cope with display
   limitations. A-labels are opaque and ugly, and, where possible,
   should thus only be exposed to users and in contexts in which they
   are absolutely needed. Because IDN labels can be rendered either
   as the A-labels or U-labels, the application may reasonably have an
   option for the user to select the preferred method of display; if
   it does, rendering the U-label should normally be the default.

Indeed, there are browsers (e.g. Safari) that actually follow this
advice (and provide a more pleasant user experience as a result).
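
To make the A-label/U-label distinction from Section 7.2 concrete, here
is a minimal Python sketch. It assumes Python's built-in "idna" codec,
which implements the older IDNA2003 rules (RFC 3490) rather than the
idnabis drafts quoted above, so treat it as an illustration only. The
function names and the sample label are made up for the example.

```python
# Sketch only: the built-in "idna" codec follows IDNA2003 (RFC 3490),
# not the idnabis drafts. All function names here are hypothetical.

def to_a_label(u_label: str) -> str:
    """Return the ASCII (xn--) form, i.e. what is carried in the DNS."""
    return u_label.encode("idna").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Return the Unicode form, i.e. what users would normally see."""
    return a_label.encode("ascii").decode("idna")


def display_form(a_label: str, prefer_unicode: bool = True) -> str:
    """Choose the form to display; the U-label is the default,
    in line with the advice in Section 7.2."""
    return to_u_label(a_label) if prefer_unicode else a_label


if __name__ == "__main__":
    print(to_a_label("bücher"))                                  # wire form
    print(display_form("xn--bcher-kva"))                         # default display
    print(display_form("xn--bcher-kva", prefer_unicode=False))   # debugging view
```

The prefer_unicode switch stands in for the user option the draft
mentions; an application would only turn it off for debugging or on
displays that cannot render the script.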
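
The "visual indication" for multi-script names suggested in Section 14
could be driven by a check along the following lines. This is a rough
sketch under a loud assumption: the Python standard library does not
expose the Unicode Script property, so the first word of each
character's Unicode name (LATIN, CYRILLIC, CJK, ...) is used as a
stand-in. The function names are hypothetical, and a real
implementation would follow the Unicode Consortium guidance cited in
the draft instead.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts used in a label.

    Assumption: the first word of a character's Unicode name is treated
    as its script. This is a heuristic, not the real Script property.
    """
    scripts = set()
    for ch in label:
        # Digits and hyphens are shared across scripts, so only
        # alphabetic characters are considered.
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if the label appears to draw on more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    # "\u0430" is CYRILLIC SMALL LETTER A, which looks like Latin "a".
    for label in ("example", "ex\u0430mple", "abc-123"):
        print(repr(label), sorted(scripts_in_label(label)),
              is_mixed_script(label))
```

A positive result is only a cue for the user interface, not a reason to
reject the name: as noted above, nothing in the protocol forbids mixed
scripts, and legitimate names (for instance kanji together with
hiragana under this heuristic) will be flagged as well.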