RE: Last Call: draft-klensin-net-utf8 (Unicode Format for Network Interchange) to Proposed Standard

"Kent Karlsson" <kent.karlsson14@xxxxxxxxx> · Thu, 10 Jan 2008 09:59:43 +0100

Stephane Bortzmeyer wrote:
> > Upon receipt, the following SHOULD be seen as at least line ending
> > (or line separating), and in some cases more than that: 
> > 
> > LF, CR+LF, VT, CR+VT, FF, CR+FF, CR (not followed by NUL...),
> > NEL, CR+NEL, LS, PS
> 
> The whole point of the Internet-Draft on Net-UTF8 is to limit the size
> of the zoo of line endings. Accepting "everything in Unicode which
> looks like a line ending" seems strange to me. Do you know any
> Internet *protocol* which accepts several line endings? (Some Internet
> *applications* do so, in the name of the robustness principle, but for
> a protocol, I think it is a really bad idea.)

Please reread my comment. I wrote:
| Apart from CR+LF, these SHOULD NOT be emitted for net-utf8, unless
| that is overridden by the protocol specification (like allowing FF, or CR+FF).
| When faced with any of these in input **to be emitted as net-utf8**, each
| of these SHOULD be converted to a CR+LF (unless that is overridden
| by the protocol in question).

I.e. this is about conversion/normalisation of input that is *TO BE*
sent as Net-UTF-8.
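
As a rough illustration only, the sender-side conversion described above
could be sketched in Python along the following lines. The function name
and the regular expression are assumptions made for this sketch, not text
from the draft; the only CR exception handled is CR followed by NUL, and
no protocol-specific override (such as allowing FF) is modelled.

```python
import re

# Hypothetical pattern covering the sequences listed in the quoted comment.
# Alternatives are tried left to right, so CR+X pairs are matched as a unit
# before a lone CR or a lone X can match.
_LINE_END = re.compile(
    r"\r[\n\x0b\x0c\x85]"              # CR+LF, CR+VT, CR+FF, CR+NEL
    r"|\r(?!\x00)"                     # lone CR, unless followed by NUL
    r"|[\n\x0b\x0c\x85\u2028\u2029]"   # LF, VT, FF, NEL, LS, PS
)


def to_crlf(text: str) -> str:
    """Replace each listed line ending with CR+LF before emitting.

    Assumption: no protocol-specific override applies, so FF and the
    other separators lose their distinct meaning in the output.
    """
    return _LINE_END.sub("\r\n", text)
```

For example, to_crlf("a\nb\u2028c") gives "a\r\nb\r\nc", while a CR that
is followed by NUL is left untouched.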

As for the receiving side, the same considerations apply as for the (SHOULD)
requirement (point numbered 4 on page 4) for NFC in Net-UTF-8.
The receiver cannot be sure that NFC has been applied. Nor can it be
sure that conversion of all line endings to CR+LF (thereby losing
information about their differences) has been applied.
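
To make the receiver's position concrete, here is a minimal Python sketch
of how a receiver might test incoming text rather than trust it. The
function name and the returned tuple are assumptions made for this
illustration; the sketch treats CR+LF and CR+NUL as the only acceptable
uses of CR and ignores any protocol-specific allowance (such as FF).

```python
import unicodedata

# Characters that should not remain once CR+LF (and CR+NUL) are set aside.
_OTHER_LINE_ENDINGS = "\r\n\x0b\x0c\x85\u2028\u2029"


def check_received(text: str) -> tuple[bool, bool]:
    """Return (is_nfc, only_crlf) for a received string.

    is_nfc    -- True if the text is unchanged by NFC normalisation.
    only_crlf -- True if no line ending other than CR+LF occurs.
    """
    is_nfc = unicodedata.normalize("NFC", text) == text
    # Drop the permitted sequences, then look for any leftover line ending.
    remainder = text.replace("\r\n", "").replace("\r\x00", "")
    only_crlf = not any(ch in _OTHER_LINE_ENDINGS for ch in remainder)
    return is_nfc, only_crlf
```

A receiver could call check_received() on each message and decide,
according to its own protocol, whether to normalise, reject, or pass the
text through when either flag is False.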

	/kent k

_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf