--On Tuesday, February 21, 2017 16:04 +0000 "tom p." <daedulus@xxxxxxxxxxxxx> wrote:

> OK on 'Updates'
>
> Meanwhile,
> s.2.2 'the basic Latin repertoire [RFC20] '
> I don't know what you mean - RFC20 does not use the term 'Latin
> repertoire' and, being European, I tend to think of the Latin
> repertoire as being the 160 or so characters of the Belgian
> language.

Mea culpa. I knew that unqualified reference was going to get us into trouble and let it go. Whether you (and future readers of 2141bis) should have known this (European or not) is debatable, but "Basic Latin" is a term of art in internationalization work. "Repertoire" keeps us away from debates about particular coded character sets, encodings, etc., and, in particular, avoids confusion with %-encoded arbitrary Unicode characters (see the introduction to Section 2), which are certainly still ASCII characters.

The definition of "basic Latin repertoire" is more or less equivalent to "undecorated Latin characters", but that terminology raises other issues... to the point that I'm tempted to say "can't win no matter what one does". Personally, I've always objected to "Basic Latin" because several characters we consider to be part of the repertoire today (e.g., "w" and distinct "j" and "u") were late additions and would not have been part of the writing system as distinct characters for writers of Latin in the time of, e.g., Virgil. But that argument was lost long ago and "basic Latin" is the prevailing terminology today.

> I think you need to specify the code points here.

I think that would cause confusion with syntax rules that specify what is possible. The paragraph in which the phrase you picked up appears simply imposes a "SHOULD NOT" restriction and explains why it should be adhered to when possible and what the exceptions should be. It really clarifies and is perhaps partially redundant with the previous paragraph.
How would you feel about making the phrase something closer to "the basic Latin repertoire, i.e., the letters and digits of ASCII as described above" and moving the RFC 20 citation to the first use of "ASCII" in that previous paragraph? Other textual suggestions are welcome but, again, I think listing code points would cause confusion with the formal syntax for <namestring> and some of the carefully-constructed language around it.

    john

p.s. personal note: 2141bis has either the advantage of, or suffers from, the fact that Peter and I have been immersed in internationalization (i18n) issues for the last several years with various involvements in PRECIS, IDNA, SMTPUTF8 ("EAI"), and other efforts. That has probably resulted in our trying to be a little more precise in this draft than many IETF documents about URI schemes and naming and perhaps, as in this case, having slightly higher expectations about community understanding of those issues than might be the case. I think that precision is an advantage and hope that community knowledge and understanding will gradually improve. YMMV, but I hope that anyone who is actually inclined to advocate for less precision (you have not) will be careful about any advocacy, even on a diversity basis, for IETF consideration of writing systems that cannot be expressed in ASCII and/or any language other than English.