> ned.freed@xxxxxxxxxxx scripsit:
>
> > I know of two other wrinkles in the RFC 1766 world:
>
> Are you aware that RFC 1766 has been obsolete for four years now?

Of course I am.

> > (2) SGN- requires special handling, in that SGN-FR and SGN-EN are in fact
> > sufficiently different languages that a primary tag match should not be
> > taken to be a generic match.
>
> The same is true of the various registered zh-* tags.

Yes, forgot to mention that one. It is actually different and more important
in that the use-cases aren't the same as those for sign languages.

> > (a) Extension tags appear as the first subtags, and as such have to
> > be taken into account when looking for country subtags.
>
> Finding country codes is straightforward: any non-initial subtag of two letters
> (not appearing to the right of "x-" or "-x-") is a country code.
> This is true in RFC 1766, RFC 3066, and the current draft.

On the contrary, in RFC 3066 the rule is "any 2 letter value that appears as
the second subtag is a country code". The rule in the new draft is either the
formulation you give above or "any 2 letter value that appears as a subtag
after the initial subtag and some number of 3 and 4 letter subtags". These
aren't the same.

> > (b) Script tags change the complexion of the matching problem significantly,
> > in that they can interact with external factors like charset information
> > in odd ways.
>
> Can you clarify this? Charset information neither specifies nor necessarily
> restricts (except in text/plain) the script used to write a document.

And what if you're dealing with text/plain, as many applications do? Just
because something doesn't necessarily do something doesn't mean it never
does it.

> > (c) UN country numbers have been added (IMO for no good reason), requiring
> > handling similar to country codes.
>
> They provide for supranational language varieties and for stability in
> country codes which is inappropriate for ISO 3166 alphabetic codes (which
> are codes for country *names*).

I'm aware of what they provide (although I see no explanation of this in the
draft). I'm just not convinced that their addition is warranted.

> > The bottom line is that while I know how to write reasonable code to do RFC
> > 1766 matching (and have in fact done so for widely deployed software), I
> > haven't a clue how to handle this new draft competently in regards to
> > matching.
>
> The draft describes only the RFC 1766 (3066) algorithm, without excluding
> other algorithms to be defined later.

Well, maybe I'm missing something obvious, but I see nothing in RFC 3066 that
qualifies as a description of a matching algorithm. The new draft does include
such a description in section 2.4.2 - an improvement - but leaves any number
of details open. And we all know where the devil lives.

Side note: I don't think item 4 really belongs in the list in section 2.4.2.
It is a warning to implementors, not part of the matching mechanism.

				Ned
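
P.S. To make the difference between the three country-code rules quoted above
concrete, here is a rough Python sketch. It is only an illustration of the
wording as quoted in this message, not code from any deployed implementation
and not a reading of the draft's grammar. The function names are made up, and
the sample tags (including the synthetic "aa-bbbbb-cc") are assumptions chosen
to show where the rules diverge.

```python
def _is_alpha2(subtag):
    """True for a subtag made of exactly two ASCII letters."""
    return len(subtag) == 2 and subtag.isascii() and subtag.isalpha()


def country_second_subtag(tag):
    """Rule as quoted for RFC 3066: a 2-letter second subtag is a country code."""
    subtags = tag.lower().split("-")
    if len(subtags) >= 2 and subtags[0] != "x" and _is_alpha2(subtags[1]):
        return subtags[1]
    return None


def country_any_noninitial(tag):
    """Rule as quoted from the reply: any non-initial 2-letter subtag that is
    not to the right of "x-" or "-x-" is a country code."""
    for i, subtag in enumerate(tag.lower().split("-")):
        if subtag == "x":
            break  # everything to the right is private use
        if i > 0 and _is_alpha2(subtag):
            return subtag
    return None


def country_after_3_4(tag):
    """Alternative reading: a 2-letter subtag that follows the initial subtag
    and some number of 3 and 4 letter subtags."""
    subtags = tag.lower().split("-")
    if subtags[0] == "x":
        return None
    for subtag in subtags[1:]:
        if _is_alpha2(subtag):
            return subtag
        is_3_or_4_letters = (
            len(subtag) in (3, 4) and subtag.isascii() and subtag.isalpha()
        )
        if not is_3_or_4_letters:
            return None  # also stops at "x" and at numeric subtags
    return None


# Sample tags are illustrative assumptions; "aa-bbbbb-cc" is synthetic.
for sample in ("en-US", "zh-Hant-TW", "aa-bbbbb-cc", "en-x-us"):
    print(
        sample,
        country_second_subtag(sample),
        country_any_noninitial(sample),
        country_after_3_4(sample),
    )
```

All three functions return the country subtag in lower case, or None when
their rule finds nothing. They agree on a plain two-subtag tag and disagree
as soon as something sits between the primary subtag and the two-letter
subtag, which is the point I was making above.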