On 9/15/17 3:16 AM, Masataka Ohta wrote:
> Heather Flanagan (RFC Series Editor) wrote:
>
>> RFC 8187, "Indicating Character Encoding and Language for HTTP Header
>> Field Parameters", is the first RFC to be published with UTF-8 encoding
>> and include characters not in the basic ASCII character set.
>
> Don't do that.
>
> It is as stupid as allowing programming languages use non ASCII
> characters.
>
> At first, it seems to be working. However, ultimately, it makes
> maintenance of code/rfc impossible, unless all the people
> maintaining the code/rfc can recognize all the characters
> in the code/rfc.

Non-ASCII characters are not trivial to include in a document, at least
if you want to make sure the document is broadly readable. So, yes, this
is an area fraught with peril. However, quite a bit of time was put into
determining what guidance should be applied so that we can handle those
characters. See https://www.rfc-editor.org/rfc/rfc7997.txt.

>
>> This
>> document has been, with the author's consent, patience, and support,
>> used to test the existing tool chain to produce RFCs to see where the
>> environment has difficulty in handling non-ASCII characters.
>
> That's not a problem. Problem is in human capability not to be able
> to recognize all the characters in the world.
>
> Internationalized code/rfc must be written using characters recognized
> by all the international people.
>
> Even if someone write a localized code for some locale, it should be
> written as:
>
>     #define NBSP '\240'
>     ...
>     putchar(NBSP);
>
> not
>
>     putchar('\240');
>
> to ease maintenance.
>
> Masataka Ohta

I don't think we can ignore these characters - they are in use, and we
need to be able to represent them in a more readable fashion than just
Unicode escape sequences.

-Heather

>
> PS
>
> Language C using full ASCII is already a problem because ASCII back
> slash character in '\240' is displayed as YEN sign of JIS X 0201
> (Japanese variant of ISO 646) on almost all computers (including mine
> I'm using now to write this mail) in Japan. It is not a serious
> problem in Japan because all the Japanese are taught that YEN sign
> is an escape character of C. But many Japanese who can use C do not
> know it is actually ASCII back slash.
>
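
For readers following the C example in the quoted text, here is a minimal sketch of the two byte values under discussion. It is not part of either message. It assumes an ASCII-compatible execution character set and ISO 8859-1 for the meaning of 0xA0. The names ESCAPE_BYTE, byte_value, and show_bytes are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Named constant from the quoted message: octal 240 is byte 0xA0,
   which is NO-BREAK SPACE in ISO 8859-1 (an assumed locale here). */
#define NBSP '\240'

/* Assumption: an ASCII-compatible execution character set, where the
   C escape character is byte 0x5C. JIS X 0201 assigns the YEN SIGN
   to that same code, so only the displayed glyph differs. */
#define ESCAPE_BYTE 0x5C

/* Hypothetical helper: the byte putchar() would write for c.
   putchar() converts its argument to unsigned char, so a negative
   value from a signed char constant still yields 0xA0. */
static unsigned char byte_value(int c) {
    return (unsigned char)c;
}

/* Hypothetical demo: prints the two byte values discussed above. */
static void show_bytes(void) {
    assert(byte_value('\\') == ESCAPE_BYTE);
    printf("NBSP byte: 0x%02X\n", (unsigned)byte_value(NBSP));
    printf("escape byte: 0x%02X\n", (unsigned)byte_value('\\'));
}
```

Calling show_bytes() from a main function prints both values. The source bytes are the same on every system; whether 0x5C is drawn as a backslash or a yen sign depends on the font and encoding of the display, which is the situation the PS describes.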