> > Rather, the rule is simply that a country code, if present,
> > always appears as a two-letter second subtag. The new draft changes this
> > rule, so applications that pay attention to country codes in language
> > tags have to change and the new algorithm for finding the country code
> > is trickier.

> Your text above says (a) "if there is a country code in the tag, it is the
> second subtag". That is not what the text of RFC 3066 actually says, which is:
>
>    The following rules apply to the second subtag:
>
>    All 2-letter subtags are interpreted as ISO 3166 alpha-2 country...

> That is, it says (b) "if a second subtag has 2 letters, then it is an ISO
> 3166 code", which is not the same as (a). (It is almost, but not quite, the
> converse.)

Fine, whatever.

> The current RFC certainly does not forbid the use of country
> codes in other positions in language tags. One could absolutely register
> en-Latin-US, for example, meaning English as spoken in the US written in
> Latin script.

Sure, but my point was, is, and always has been that any 3066-compliant
implementation won't see this as a country code (unless it is table driven,
which brings up its own set of issues).

> There has been a lot of noise on this issue, and too few concrete examples.

No, what there has been is a lot of discussion of a real problem with no
apparent recognition of it as such by the draft authors. Your pejorative
characterization of this as "noise" does not make it so.

> In the so-called 3066bis draft, we have striven very hard to ensure that:

> (c) Every single tag that could be generated under RFC 3066bis is a tag that
> could have been registered under RFC 3066.

True but irrelevant.

> Thus if someone wrote a parser that is future-compatible -- that could parse
> all RFC 3066 language tags including those registered after the parser was
> deployed -- then that parser can handle all 3066bis language tags.
> This is a significant advance over RFC 3066, whose registered (not
> generated) language tags are atomic, and cannot be effectively parsed at
> all. 3066bis adds more structure so as to allow effective parsing of tags.

> If you *can* come up with tags that would show that (c) is invalid, that
> would be a concrete case that we would have to make adjustments in the draft
> for.

(c) is frankly not an issue I care one whit about. (Perhaps I should, but I
don't.) I don't register tags. I write code that processes, and more to the
point matches, tags. That's why I have issues with this draft.

> Moreover, all the talk about this being *too* complex is far overblown.

Again, your pejorative dismissal of other people's concerns does not mean
your position is valid.

> All 3066bis language tags can be parsed, including all the grandfathered
> codes, with a very short piece of code, or even with a regular expression
> (such as in Perl).

Of course you can write a short piece of code to parse this stuff. It's what
you do with it after you parse it that's a problem.

> This is not rocket science.

Parsing almost never is. But simply parsing these tags is not, and never has
been, the issue.

                                Ned

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf
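
To make the en-Latin-US point concrete, here is a minimal sketch of code that
applies only rule (b) from the RFC 3066 text quoted above: a second subtag is
read as an ISO 3166 country code when it has exactly two letters. The function
name rfc3066_country is hypothetical, and the sketch assumes the
implementation is not table driven, so it knows nothing about registered tags.

```python
import re


def rfc3066_country(tag):
    """Return the country code seen under the second-subtag rule, or None.

    Assumption: only rule (b) is applied. A second subtag made of exactly
    two letters is taken to be an ISO 3166 alpha-2 code. No registry table
    is consulted, so registered tags are not treated specially.
    """
    subtags = tag.split("-")
    if len(subtags) >= 2 and re.fullmatch(r"[A-Za-z]{2}", subtags[1]):
        return subtags[1].upper()
    return None


for tag in ("en-US", "en-Latin-US"):
    print(tag, "->", rfc3066_country(tag))
```

Under these assumptions the function reports US for en-US and no country at
all for en-Latin-US, because the second subtag there is "Latin" and the rule
never looks at the third position.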