Interesting.

The euro symbol displays fine but, depending on which utility I use to
display, I do get some blobs to indicate undisplayable characters. Thus
in the line which starts

into parameter valu ....

and ends

.... how to encode non-ASCII characters

the 'e' of value is not an ASCII 'e' and my attempts to cut and paste
the line into this e-mail fail after the 'u' of 'values', suggesting
that there is a line terminator in there.

Unless - of course - that letter I presume is 'e' is intended to be a
glyph outside normal ASCII with a line terminator function.

Tom Petch

----- Original Message -----
From: "Heather Flanagan (RFC Series Editor)" <rse@xxxxxxxxxxxxxx>
To: <ietf@xxxxxxxx>
Sent: Friday, September 15, 2017 5:05 PM
Subject: Re: RFC Series publishes first RFC with non-ASCII characters

> On 9/15/17 3:16 AM, Masataka Ohta wrote:
> > Heather Flanagan (RFC Series Editor) wrote:
> >
> >> RFC 8187, "Indicating Character Encoding and Language for HTTP Header
> >> Field Parameters", is the first RFC to be published with UTF-8 encoding
> >> and include characters not in the basic ASCII character set.
> >
> > Don't do that.
> >
> > It is as stupid as allowing programming languages use non ASCII
> > characters.
> >
> > At first, it seems to be working. However, ultimately, it makes
> > maintenance of code/rfc impossible, unless all the people
> > maintaining the code/rfc can recognize all the characters
> > in the code/rfc.
>
> Non-ASCII characters are not trivial to include in a document, at least
> if you want to make sure the document is broadly readable. So, yes, this
> is an area fraught with peril. However, quite a bit of time was put into
> determining what guidance should be applied so that we can handle those
> characters. See https://www.rfc-editor.org/rfc/rfc7997.txt.
>
> >
> >> This
> >> document has been, with the author's consent, patience, and support,
> >> used to test the existing tool chain to produce RFCs to see where the
> >> environment has difficulty in handling non-ASCII characters.
> >
> > That's not a problem. Problem is in human capability not to be able
> > to recognize all the characters in the world.
> >
> > Internationalized code/rfc must be written using characters recognized
> > by all the international people.
> >
> > Even if someone write a localized code for some locale, it should be
> > written as:
> >
> > #define NBSP '\240'
> > ...
> > putchar(NBSP);
> >
> > not
> >
> > putchar('\240');
> >
> > to ease maintenance.
> >
> > Masataka Ohta
>
> I don't think we can ignore these characters - they are in use, and we
> need to be able to represent them in a more readable fashion than just
> Unicode escape sequences.
>
> -Heather
>
> >
> > PS
> >
> > Language C using full ASCII is already a problem because ASCII back
> > slash character in '\240' is displayed as YEN sign of JIS X 0201
> > (Japanese variant of ISO 646) on almost all computers (including mine
> > I'm using now to write this mail) in Japan. It is not a serious
> > problem in Japan because all the Japanese are taught that YEN sign
> > is an escape character of C. But many Japanese who can use C do not
> > know it is actually ASCII back slash.
> >
>
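
One way to pin down which character in a line is not plain ASCII, as with the suspect 'e' described at the top of this message, is to scan the bytes and report the first one above 0x7F. The following is a minimal sketch in C under that assumption; the function name first_non_ascii is hypothetical and is not taken from any tool mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the byte offset of the first byte outside
 * 7-bit ASCII in the NUL-terminated string s, or -1 if every byte is
 * plain ASCII. It inspects raw bytes only and does not decode UTF-8. */
long first_non_ascii(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Cast so that bytes >= 0x80 compare correctly when char is signed. */
        if ((unsigned char)s[i] > 0x7F)
            return (long)i;
    }
    return -1;
}
```

Applied to the line in question, the returned offset would show where the non-ASCII byte sits. Because the scan is byte-wise, a multi-byte UTF-8 character is reported at its first byte.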