--On Tuesday, 04 January, 2005 12:52 -0500 John Cowan
<jcowan@xxxxxxxxxxxxxxxxx> wrote:

> John C Klensin scripsit:
>
>> Returning to the DNS/IDN situation, ICANN has created a
>> recommendation for all TLDs, and a requirement on at least
>> some gTLDs, that languages not be mixed within a label and for
>> registration and use of tables similar to those recommended by
>> RFC 3743.
>
> This regulation is going to be completely unenforceable, since
> with a few exceptions (hexagonal French), languages do not
> have bright-line rules saying what words they do and do not
> contain. Are we to be in the position of saying that
> eigenvector.com may be registered (and is) because the word
> appears in dictionaries, whereas eigenevent.com is ruled out
> because it "mixes" English and German?

John,

I am sure that ICANN would welcome your participation as the
various rules/guidelines evolve -- those rules are not an IETF
problem, even though changes to the standard that is used to
label them might be. One of the things their processes have in
common with the IETF is that they prefer that people actually
try to read and understand documents before attacking them, but
I suppose there are always exceptions.

In particular, the recommendations of RFC 3743 are about tables
of characters, not dictionary lookup. If, however, a domain
decided to adopt a canonical dictionary and lookup in it as a
registration criterion, that rule would be perfectly
enforceable. I'd recommend against it for many reasons, but
this would be more or less up to them.

> Forbidding the mixing of scripts is another matter, although
> in fact some languages are written using more than one
> (Unicode) script.

Whether those languages are a problem or not in the DNS context
depends on whether one wishes to permit a single label to use
both (or all three in at least a few cases I know of) scripts.
Again a per-registry decision and again perfectly enforceable
either way.
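
To make the enforceability point concrete, here is a minimal
Python sketch of a table-based check. Everything in it is an
assumption made for illustration: the two tiny tables, the
function names, and the allow_mixing switch are hypothetical and
are not taken from RFC 3743 or from any ICANN guideline. It only
shows that, once a registry has chosen its tables and its mixing
policy, the test on a label is mechanical.

```python
# Hypothetical, deliberately tiny character tables. A real registry
# table (in the spirit of RFC 3743) would be far larger and would
# also carry variant information, which is not modelled here.
LATIN_TABLE = set("abcdefghijklmnopqrstuvwxyz0123456789-")
CYRILLIC_TABLE = set("абвгдежзийклмнопрстуфхцчшщъыьэюя0123456789-")


def fits_table(label, table):
    """Return True if the label is non-empty and every character is in the table."""
    return bool(label) and all(ch in table for ch in label)


def permitted(label, tables, allow_mixing=False):
    """Apply a registry's (hypothetical) mixing policy to one label.

    allow_mixing=False: the label must fit entirely within one table.
    allow_mixing=True:  the label may draw on the union of all tables.
    """
    if allow_mixing:
        combined = set().union(*tables)
        return fits_table(label, combined)
    return any(fits_table(label, t) for t in tables)


tables = [LATIN_TABLE, CYRILLIC_TABLE]
```

With these tables, permitted("eigenevent", tables) succeeds
because every character is in the Latin table, with no dictionary
involved, while a label that mixes Latin and Cyrillic letters is
accepted only if the registry sets allow_mixing=True.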

Other issues occur if the writing order of characters in a
language obeys specific rules and one chooses to enforce them (a
potential issue with, e.g., Hangul, although, again, the choice
of whether or not to try to enforce is up to the registry).

But one of the notational problems with using 3066 would be a
rule that one can have a label that contains the characters of a
given language written in, e.g., either a modified Arabic script
or a modified Cyrillic one but not in a modified Roman ("Latin")
one. Another issue arises when one wants to permit a character
collection that includes the characters from a given script that
are used by two separate languages -- not all of the characters
of that script, but exactly those characters that fall into the
union of the characters from the script used by the relevant
languages.

It is not clear that the current proposal is much better than
3066 for handling those cases, but I wonder if anyone has
carefully evaluated whether it would make things worse.

    john

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf