Re: RFC Series publishes first RFC with non-ASCII characters

Interesting.

The euro symbol displays fine but, depending on which utility I use to
display the document, I do get some blobs indicating undisplayable
characters.  Thus
in the line which starts
   into parameter valu ....
and ends
.... how to encode non-ASCII characters

the 'e' of 'values' is not an ASCII 'e', and my attempts to cut and paste
the line into this e-mail fail after the 'u' of 'values', suggesting
that there is a line terminator in there.

Unless, of course, the letter I presume to be 'e' is intended to be a
glyph outside normal ASCII with a line terminator function.
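
For anyone who wants to check where such a character sits in a line, here
is a minimal sketch in C. It assumes the line is available as a
NUL-terminated byte string; the function names first_non_ascii and
count_non_ascii are hypothetical, chosen only for illustration. It relies
on the fact that in UTF-8 every byte of a non-ASCII character has a value
of 0x80 or above.

```c
#include <stddef.h>

/* Hypothetical helper: return the offset of the first byte outside
 * 7-bit ASCII (value >= 0x80), or the string length if every byte
 * is ASCII. */
size_t first_non_ascii(const char *s)
{
    size_t i = 0;
    while (s[i] != '\0' && (unsigned char)s[i] < 0x80) {
        i++;
    }
    return i;
}

/* Hypothetical helper: count the bytes outside 7-bit ASCII. In UTF-8
 * a single non-ASCII character contributes two or more such bytes. */
size_t count_non_ascii(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80) {
            n++;
        }
    }
    return n;
}
```

Calling first_non_ascii() on a line read from the file gives the byte
offset at which a cut and paste through an ASCII-only path would be
expected to go wrong; a result equal to the line length means the line is
pure ASCII.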

Tom Petch

----- Original Message -----
From: "Heather Flanagan (RFC Series Editor)" <rse@xxxxxxxxxxxxxx>
To: <ietf@xxxxxxxx>
Sent: Friday, September 15, 2017 5:05 PM
Subject: Re: RFC Series publishes first RFC with non-ASCII characters


> On 9/15/17 3:16 AM, Masataka Ohta wrote:
> > Heather Flanagan (RFC Series Editor) wrote:
> >
> >> RFC 8187, "Indicating Character Encoding and Language for HTTP Header
> >> Field Parameters", is the first RFC to be published with UTF-8 encoding
> >> and include characters not in the basic ASCII character set.
> >
> > Don't do that.
> >
> > It is as stupid as allowing programming languages to use non-ASCII
> > characters.
> >
> > At first, it seems to be working. However, ultimately, it makes
> > maintenance of code/rfc impossible, unless all the people
> > maintaining the code/rfc can recognize all the characters
> > in the code/rfc.
>
> Non-ASCII characters are not trivial to include in a document, at least
> if you want to make sure the document is broadly readable. So, yes, this
> is an area fraught with peril. However, quite a bit of time was put into
> determining what guidance should be applied so that we can handle those
> characters. See https://www.rfc-editor.org/rfc/rfc7997.txt.
>
> >
> >> This
> >> document has been, with the author's consent, patience, and support,
> >> used to test the existing tool chain to produce RFCs to see where the
> >> environment has difficulty in handling non-ASCII characters.
> >
> > That's not a problem. The problem is that humans are not able
> > to recognize all the characters in the world.
> >
> > Internationalized code/rfc must be written using characters recognized
> > by all the international people.
> >
> > Even if someone writes localized code for some locale, it should be
> > written as:
> >
> > #define NBSP '\240'
> > ...
> > putchar(NBSP);
> >
> > not
> >
> > putchar('\240');
> >
> > to ease maintenance.
> >
> > Masataka Ohta
>
> I don't think we can ignore these characters - they are in use, and we
> need to be able to represent them in a more readable fashion than just
> Unicode escape sequences.
>
> -Heather
>
> >
> > PS
> >
> > The C language, which uses full ASCII, is already a problem because
> > the ASCII backslash character in '\240' is displayed as the YEN sign
> > of JIS X 0201 (Japanese variant of ISO 646) on almost all computers
> > (including mine, which I'm using now to write this mail) in Japan.
> > It is not a serious problem in Japan because all the Japanese are
> > taught that the YEN sign is an escape character of C. But many
> > Japanese who can use C do not know it is actually the ASCII backslash.
> >
>



