--On Thursday, December 18, 2014 15:16 +0100 Julian Reschke
<julian.reschke@xxxxxx> wrote:

>...
> So RFC 20 says it defines a "coded character set". However, in
> current specs (at least in APPS) we frequently talk about
> "character encoding schemes"
> (<http://tools.ietf.org/html/rfc6365#section-2>), in general
> mapping Unicode code points to octet sequences.
>
> So does RFC 20 define a CES as well? If it does not, should we
> have an additional document taking care of this?

With the understanding that RFC 20's being used successfully for
45 years continues to be a very strong argument that it doesn't
need changes or supplemental materials, and noting that many
IETF participants were not reading ANSI/USASA standards (or much
of anything else) 45 years ago:

(1) The terms "coded character set" and "[character] code for
information interchange" were in use long before Unicode and its
multiple encodings/representation forms started to redefine
it/them. In this context, "long" is measured in decades, not
years.

(2) Early versions of ASCII did not specify what we would now
call "encoding" information. They just specified a repertoire
and the associated 7-bit CCS. Later ones, IIRC, do specify
encoding information. That type of difference is one of the
reasons we need to be careful about version numbers or dates
when referencing other people's standards (and why stable
references are important). For ASCII, the result was that we
ended up with at least two different ways to put those 7-bit
characters into an 8-bit "byte" and at least two different ways
to put them into a 36-bit word.

(3) Partly, I assume, because the encoding issues mentioned in
(2) had made most people working on anything resembling
applications on the network familiar with them, RFC 20 does
specify an on-the-wire encoding for ASCII.
That is one of the things that makes it more useful than a
reference to ASCII alone: it specifies what we started calling a
"charset" in the early MIME days, i.e., a combination of a CCS
and a CES.

So, AFAICT, nothing else is needed in this area other than
getting on with it and ceasing to embarrass ourselves by
dragging out this discussion of a 45-year-old spec for something
we've used heavily and for which there has never been a problem
for what is now called the Basic Latin repertoire.

best,
   john
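
As an illustration of points (2) and (3) above, here is a minimal
Python sketch of the difference between a coded character set (a
mapping from characters to 7-bit numbers) and the way those numbers
are placed into bytes or words. The first function follows RFC 20's
suggestion of a 7-bit code in an 8-bit byte whose high-order bit is
always 0. The other layouts (high-order bit set to 1, five 7-bit
codes left-justified in a 36-bit word, four 9-bit fields in a 36-bit
word) are assumptions chosen for illustration; the message does not
say which alternatives it has in mind, and all function names here
are hypothetical.

```python
def encode_rfc20(text: str) -> bytes:
    """Place each 7-bit code in one octet with the high-order bit 0."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError(f"{ch!r} is outside the 7-bit repertoire")
        out.append(code)
    return bytes(out)


def encode_high_bit_one(text: str) -> bytes:
    """Illustrative alternative: same 7-bit codes, high-order bit 1."""
    return bytes(octet | 0x80 for octet in encode_rfc20(text))


def _padded_codes(chars: str, width: int) -> list:
    """Return the 7-bit codes of chars, padded with NUL to width."""
    codes = list(encode_rfc20(chars))
    if len(codes) > width:
        raise ValueError(f"at most {width} characters fit in one word")
    return codes + [0] * (width - len(codes))


def pack_5x7(chars: str) -> int:
    """Illustrative layout: five 7-bit codes, left-justified in a
    36-bit word, leaving one spare low-order bit."""
    word = 0
    for code in _padded_codes(chars, 5):
        word = (word << 7) | code
    return word << 1


def pack_4x9(chars: str) -> int:
    """Illustrative layout: four 9-bit fields in a 36-bit word, each
    holding one 7-bit code with two leading zero bits."""
    word = 0
    for code in _padded_codes(chars, 4):
        word = (word << 9) | code
    return word
```

The same character "A" (code 0x41 in the CCS) becomes the octet 0x41
under encode_rfc20 and 0xC1 under encode_high_bit_one, and the same
four characters produce different 36-bit values under pack_5x7 and
pack_4x9. A reference that fixes only the CCS leaves that choice
open; fixing the octet layout as well is what makes it a "charset"
in the MIME sense.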