--On Sunday, October 23, 2011 07:11 +0100 Dave CROCKER
<dhc@xxxxxxxxxxxx> wrote:

>
>> Remember, in UTF-8, characters can be multiple octets. So 998
>> UTF-8 encoded *characters* are likely to be many more than
>> 998 octets long. So the change is to say that the limit is in
>> octets, not in characters.
>
> The switch in vocabulary is clearly subtle for readers. (I
> missed it too.)
>
> I suggest adding some language that highlights the point,
> possibly the same language as you just used to explain it.

In addition to what might be useful/necessary for readers of
5335bis, in retrospect, we ought to have a prominent comment in
one of the more generic i18n documents that highlights the fact
that, once one moves beyond ASCII, length-in-characters and
length-in-octets can no longer be assumed to be the same. When
one is actually talking about storage length,
length-in-characters should be prohibited from our vocabulary
going forward.

That would actually make an interesting extension to a
nits-checker if someone could figure out how to do it or, at
least, a flag to the RFC Editor about something they should
watch out for.

    john

_______________________________________________
Ietf mailing list
Ietf@xxxxxxxx
https://www.ietf.org/mailman/listinfo/ietf
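
As a rough illustration of the characters-versus-octets point made in the message above, the following Python sketch compares the two lengths for a few strings. The function names, the constant name, and the sample strings are assumptions made up for this example; only the figure 998 comes from the thread. Python's len() on a str counts code points, which is used here as a stand-in for "characters".

```python
from typing import Tuple

# The 998 figure is the limit quoted in the thread; the name is hypothetical.
LINE_LIMIT_OCTETS = 998


def lengths(text: str) -> Tuple[int, int]:
    """Return (code points, octets) for text when encoded as UTF-8."""
    return len(text), len(text.encode("utf-8"))


def fits_limit(text: str, limit: int = LINE_LIMIT_OCTETS) -> bool:
    """Check the limit against the encoded octet count, not the character count."""
    return len(text.encode("utf-8")) <= limit


if __name__ == "__main__":
    samples = {
        "ASCII 'a'": "a" * 998,             # 1 octet per character in UTF-8
        "Greek alpha": "\u03b1" * 998,      # 2 octets per character in UTF-8
        "CJK U+65E5": "\u65e5" * 998,       # 3 octets per character in UTF-8
    }
    for label, line in samples.items():
        chars, octets = lengths(line)
        print(f"{label}: {chars} characters, {octets} octets, "
              f"fits limit: {fits_limit(line)}")
```

All three sample lines are 998 characters long, but only the ASCII one stays within 998 octets, which is why a limit stated in characters and a limit stated in octets stop being interchangeable once the text is not pure ASCII.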