Resuming my comments:

> -----Original Message-----
> From: ietf-languages-bounces@xxxxxxxxxxxxx [mailto:ietf-languages-
> bounces@xxxxxxxxxxxxx] On Behalf Of Bruce Lilly

[snip]

> Specifically, the draft allows, and RFC 3066 disallows:
>   subtags more than 8 octets in length
>   hyphens which do not separate subtags
>   zero-length subtags
>   primary tags which are not purely alphabetic
> Curiously, all of those are permitted by the draft ABNF
> production "grandfathered"...

The "grandfathered" production in the current draft is

   grandfathered = ALPHA *(alphanum / "-")

which does permit the sequences claimed by Bruce (except for not-purely-alphabetic primary sub-tags), syntactically; but the set of tags available for use is constrained by more than the ABNF syntax alone: the acceptable productions for each sub-tag must either be taken from one of the source standards or be registered. This is no different from RFC 3066, so it is no more of a problem in this specification than it was in RFC 3066.

It might be that the wording in 2.2 could be tightened up to eliminate any possible question regarding the source for "grandfathered" productions. Maybe it's not as obvious to someone coming to this cold as it is for us who have been discussing it for the past year. Alternately, there's no reason why the "grandfathered" production shouldn't be composed exactly to match what was used in RFC 3066:

   grandfathered = 1*8ALPHA *("-" 1*8alphanum)

So, perhaps there is room for technical improvement, but there are not any serious problems IMO -- certainly nothing as serious as the tone of Bruce's message conveyed.

> I see no reason for the ABNF to permit such content as is
> forbidden by RFC 3066; the actual ABNF for what RFC 3066
> permits is contained within 3066, and could have been directly
> incorporated rather than producing a "grandfathered"
> production which opens up several cans of worms.

This vastly overstates the problem.
There is no can of worms unless it exists in tags currently available under RFC 3066.

> One defect related to tag length in RFC 3066 is not remedied
> by the draft; indeed the problem is greatly exacerbated...
> Unfortunately, a language-tag's length is unlimited by
> the ABNF in RFC 3066 (due to an unlimited number of subtags)
> and in the draft...
> In particular, tags other than private-use tags with more than
> two subtags require registration under RFC 3066 rules, and it
> is a trivial matter to determine the longest registered tag.
> The draft, however, encourages use of more subtags as well as
> removal of the subtag length upper bound; moreover, it permits
> infinite numbers of subtags without requiring registration of
> the resulting complete tag.

Bruce states incorrectly that there is no upper bound on the length of sub-tags. His other concern, on the overall length of complete tags, is valid, however: in terms of the ABNF syntax for both RFC 3066 and RFC 3066bis, infinite-length productions are possible, but RFC 3066 would require registration of complete non-private-use tags while RFC 3066bis does not. There are three open doors for infinite-length productions in the ABNF of the current draft:

- unlimited extlang sub-tags
- unlimited variant sub-tags
- the number of possible extensions is limited to 25, but the length of extensions is unlimited

We could impose some upper limits on these things; e.g.

   Language-Tag = ... *8("-" extlang) ... *8("-" variant) ... 1*25("-" extension) ...
   extension = singleton 1*8("-" 2*8alphanum)

If we also imposed limits on the length of private-use tags and defined the grandfathered production in a way that made clear there was an upper limit for those, then we could end up eliminating an issue that had existed in RFC 3066. So, I think Bruce has identified a valid issue here.
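
To make the effect of such bounds concrete, here is a minimal Python sketch (not part of the draft) that transliterates three of the productions quoted in this message into regular expressions: the draft's "grandfathered" production, the RFC 3066-style alternative, and the bounded "extension" production suggested above. The constant and function names are hypothetical. ALPHA and alphanum are assumed to be ASCII-only, and "singleton" is assumed to be any single ASCII letter other than "x", which is consistent with the limit of 25 extensions but is not spelled out in this message.

```python
import re

# Assumption: ABNF "alphanum" means ASCII letters and digits only.
ALNUM = "[A-Za-z0-9]"

# Draft production:  grandfathered = ALPHA *(alphanum / "-")
LOOSE_GRANDFATHERED = re.compile(rf"[A-Za-z](?:{ALNUM}|-)*")

# RFC 3066-style:    grandfathered = 1*8ALPHA *("-" 1*8alphanum)
STRICT_GRANDFATHERED = re.compile(rf"[A-Za-z]{{1,8}}(?:-{ALNUM}{{1,8}})*")

# Suggested bound:   extension = singleton 1*8("-" 2*8alphanum)
# Assumption: singleton is one ASCII letter other than "x"/"X".
BOUNDED_EXTENSION = re.compile(rf"[A-WY-Za-wy-z](?:-{ALNUM}{{2,8}}){{1,8}}")

# Longest string the bounded extension production can generate:
# one singleton plus eight subtags of (hyphen + 8 characters).
MAX_EXTENSION_LEN = 1 + 8 * (1 + 8)


def accepts(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for tag in ["i-klingon", "en--us", "en-", "abcdefghijk"]:
        print(tag,
              "draft:", accepts(LOOSE_GRANDFATHERED, tag),
              "rfc3066-style:", accepts(STRICT_GRANDFATHERED, tag))
    print("max bounded extension length:", MAX_EXTENSION_LEN)
```

Under these assumptions, strings with zero-length sub-tags, trailing hyphens, or sub-tags longer than 8 characters are accepted by the draft's production but rejected by the RFC 3066-style one, and a single bounded extension can be at most 73 characters long.
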
I personally would not have characterized it as greatly exacerbating, though, as the issue was present in RFC 3066: private-use tags did not need to be registered in RFC 3066, so there was no way an implementation could be written with certain knowledge that tags beyond some given length would not be encountered.

> > The new registry provides a complete,
> > easily parseable file which provides the precise the contents of valid tags for
> > any point in time.
>
> That is the first time I have ever heard ISO 8601 date
> format described as "easily parseable". Perhaps the draft
> authors meant to say that a specific subset of the tortuously
> complex ISO 8601 date format is used, but that is not what
> the draft states...

It seems very clear that the authors intended only a specific subset: YYYY-MM-DD. This is a minor technical issue that the authors can very easily remedy.

> I am absolutely shocked that a draft dealing with language
> lacks an "Internationalization considerations" section as
> recommended by RFC 2277 (a.k.a. BCP 18).

No more or less shocking than for RFC 3066, regarding which I'm not aware of any complaints. I don't quite understand what the critique is here: what is there to internationalize about language tags? They are symbolic identifiers that have no culture-specific content. The only possible consideration is the charset, which for this spec involves ALPHA, DIGIT and "-" only. It's true that ALPHA and DIGIT are not defined and that it would be better to do so; it couldn't hurt to have a section for i18n considerations (wouldn't need to be long). These are very minor concerns, and hardly "shocking".

> Perhaps even more disturbing is the content of the "IANA
> Considerations" section; the draft predicts that certain things
> will happen ("IANA will"[...]), but doesn't actually direct
> (e.g. "IANA shall") IANA to do anything.
> The placement of that
> section does not correspond to current RFC-Editor guidelines
> (it should appear after Security Considerations); also on that
> point, Appendices should precede References.

There is a process issue here, but I have assumed that the authors have dealt with IANA on that. Otherwise, these are editorial issues -- "even more disturbing" seems to me to be somewhat overstated.

> Many of the references are obsolete (e.g. RFCs 1327,
> 1521)... and at least one reference ([19])
> gives a bracketed URI rather than the correctly formatted
> RFC reference. Although reference is made to the "Accept-
> Language" header field, RFC 3282 (the defining RFC for that
> field) is not listed among the references...
> The formatting of the draft is atrocious

All editorial.

> there is no differentiation between normative and
> informative references,

A valid concern.

> I am extremely surprised that the draft has been published
> at least nine times in such a state of poor formatting and
> poor attention to editorial content (e.g. obsolete and
> missing references), and that it progressed as far as IESG
> last call in such a state, with no Internationalization
> considerations section, etc.

In fairness to the authors, page-oriented plain text is not exactly conducive to authoring and revising a long document, and a lot of energy was spent focusing on details that have far more consequence than formatting. And, as mentioned above, the lack of an i18n-concerns section is hardly without precedent, and not particularly significant in the case of this spec. This really feels like nit-picking, IMO. I'm left wondering if Bruce has been looking for nits to pick because he is...

> ... particularly concerned about the implementation
> ramifications of the proposed changes, especially (as
> noted in detail above):
> 1. the apparent contradiction between the stated
>    objectives w.r.t.
>    accessibility of relevant ISO data and
>    standards and the reality of the proposal's
>    implications (ISO 8601 date format parsing).

As mentioned above, this really is a non-issue.

> 2. the clear contradiction between the claims about
>    ABNF compatibility with RFC 3066 and the factual
>    incompatibility of certain provisions in the grammar.

The main concern was with the "grandfathered" production, but I've shown that that is a non-issue. The maximal length issue exists just as much in RFC 3066 due to private-use tags; it is a technical concern that might be worth reviewing in RFC 3066bis, however; but it is not insurmountable, and not a new problem.

Peter Constable
Microsoft Corporation

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf