Re: Last Call: draft-klensin-net-utf8 (Unicode Format for Network Interchange) to Proposed Standard

John C Klensin wrote:

> It is ambiguous for HT.

Yes, but we typically don't care about this in protocols as
long as it behaves like one or more spaces.  I think that's
the idea of "WSP = SP / HTAB ; white space" in RFC 4234bis,
waiting for its STD number.

We talked about the 4234bis issue of "trailing white space",
which could cause havoc when it is silently removed, and a
"really empty line" is not the same as an "apparently empty
line" (i.e. CRLF CRLF vs. CRLF 1*WSP CRLF).
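To make the difference concrete, here is a minimal sketch in Python.
The function names are made up for this example and come from no
draft or RFC; WSP is taken as SP / HTAB.  It labels each CRLF-delimited
line and shows what silently stripping trailing white space does to an
"apparently empty line".

```python
import re

# WSP = SP / HTAB (assumption: the ABNF core rule quoted above)
WSP_RUN = re.compile(rb"[ \t]+")

def classify_lines(data: bytes) -> list:
    """Label each CRLF-delimited line as 'empty', 'wsp-only', or 'text'."""
    labels = []
    for line in data.split(b"\r\n"):
        if line == b"":
            labels.append("empty")        # "really empty": CRLF CRLF
        elif WSP_RUN.fullmatch(line):
            labels.append("wsp-only")     # "apparently empty": CRLF 1*WSP CRLF
        else:
            labels.append("text")
    return labels

def strip_trailing_wsp(data: bytes) -> bytes:
    """Silently remove trailing SP/HTAB from every line (the risky step)."""
    return b"\r\n".join(line.rstrip(b" \t") for line in data.split(b"\r\n"))

message = b"Subject: x\r\n \t\r\nbody\r\n"
before = classify_lines(message)
after = classify_lines(strip_trailing_wsp(message))
```

In this sketch the second line is 'wsp-only' before stripping and
'empty' afterwards, so a parser that treats the first really empty
line as a separator would see a different message.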

A similar robustness principle would support accepting old
"HTAB-compression" or "HTAB-beautification" (e.g. as first
character in a folded line).  In other words WSP, not only
SP.  It is clear that the outcome is ambiguous, but in some
protocols I care about (headers in MIME, mail, and news)
*WSP or 1*WSP are acceptable.   Admittedly it is a pain when
signatures need white space canonicalization.  But replacing
*WSP by *SP would only simplify this step, not get rid of it.
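As a rough illustration, assuming a made-up canonicalization rule
(unfold, then collapse each run of WSP to one SP; this is not taken
from any particular signature scheme), a Python sketch:

```python
import re

def unfold(header: bytes) -> bytes:
    """Remove each CRLF that is immediately followed by SP or HTAB."""
    return re.sub(rb"\r\n(?=[ \t])", b"", header)

def canonicalize_wsp(header: bytes) -> bytes:
    """Hypothetical rule: unfold, then collapse every WSP run to one SP."""
    return re.sub(rb"[ \t]+", b" ", unfold(header))

folded_with_htab = b"Subject: a\r\n\tb"   # old "HTAB-beautification"
folded_with_sp = b"Subject: a\r\n b"
two_spaces = b"Subject: a  b"             # SP only, still not canonical

canonical = {canonicalize_wsp(h)
             for h in (folded_with_htab, folded_with_sp, two_spaces)}
```

All three inputs end up as the same octets.  The third one contains no
HTAB at all and still needs the collapsing step, which is the point:
*SP instead of *WSP narrows the character class in the pattern, it
does not remove the canonicalization.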

 [About CRLF]
> Unicode 5.0, Section 5.8, provides significant insight into
> the complexity of this problem and probably should have
> been referenced.  It would be even more helpful had Table
> 5-2 included identifying CRLF as a standard Internet "wire"
> form of NLF, not just binding that form to Windows.

Indeed, this chapter offers significantly *broken* insight
for our purposes.  What they found was a horrible mess, then
they introduced wannabe-unambiguous LS + PS, and what they
arrived at was messier than before.  Claiming that CRLF is
"windows" is odd for DOS + OS/2 users; it is also at odds
with numerous Internet standards - precisely the reason why
we need your draft.

The chapter talks about line and paragraph separators without
mentioning relevant ASCII controls such as RS.  On the other
hand it mentions MS Word internals which are nobody's business
outside of MS Word.  It is interesting, but IMO unusable for
net-utf8.

 Frank

