Re: [Last-Call] New Version Notification for draft-crocker-inreply-react-07.txt

John C Klensin <john@xxxxxxx> · Tue, 02 Mar 2021 14:36:37 -0500

To both One and Two:  

Exactly.  See the aside in my note to Ned for details of my
explanation, which parallels yours.

thanks,
   john

--On Tuesday, 02 March, 2021 09:34 -0500 Ricardo Signes
<rjbs@semiotic.systems> wrote:

> On Tue, Mar 2, 2021, at 9:15 AM, John C Klensin wrote:
>> I don't know whose concern prompted that particular switch,
>> or why, but my concern about either (and, I'm guessing,
>> Martin's) is that almost all Unicode code points (those
>> outside the ASCII range) require more than one octet to
>> represent in any encoding scheme.  For UTF-8, which the I-D
>> requires, the number of octets is variable.  So using "octet"
>> as a unit of --well, much of anything--is, at best, confusing.
> 
> "octet" appears in two places.
> 
> One:
> 
>    The rule emoji_sequence is inherited from [Emoji-Seq].  It
>    defines a set of octet sequences, each of which forms a
>    single pictograph.
> 
> I would replace "octet" with "code point".  The referenced
> document only describes sequences of code points.  The
> encoding of those into octets is orthogonal, and will be
> described by the content-type and content-transfer-encoding
> jointly.  So, I think this change is a definite improvement to
> accuracy, and is worth making.
> 
> Two: 
> 
>    Reference to unallocated code points SHOULD NOT be treated
>    as an error; the corresponding octets SHOULD be processed
>    using the system default method for denoting an unallocated
>    or undisplayable code point.
> 
> I suggest the same change.  It's -maybe- more debatable.  But
> this document is describing what to do with the decoded
> content, because it doesn't describe anything about C-T-E or
> charset decoding.  We must assume that the decoding layer has
> done its job and now we either have a total error or a
> code point sequence.  (Some decode layers will have been
> instructed to hand back REPLACEMENT CHARACTER when the octet
> sequence was mangled, which will not be a valid emoji
> sequence, and everything works out.)
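
To make the code point versus octet distinction in the quoted text
concrete, here is a minimal Python sketch.  It is an illustration
only, not anything taken from the I-D: the names `samples` and
`describe` are hypothetical, the sample strings are arbitrary, and
it assumes the content has already been through C-T-E decoding so
that only the charset (UTF-8) step remains.  It does not attempt to
validate emoji_sequence.

```python
# Illustration only: compare code point counts with UTF-8 octet counts,
# then show a decoding layer substituting REPLACEMENT CHARACTER.

# Hypothetical sample strings (not from the I-D).
samples = {
    "ASCII letter": "A",
    "thumbs up": "\U0001F44D",
    "thumbs up + skin tone": "\U0001F44D\U0001F3FD",
}


def describe(text):
    """Return (number of code points, number of UTF-8 octets)."""
    # len() on a Python str counts code points, not octets.
    return len(text), len(text.encode("utf-8"))


for label, text in samples.items():
    code_points, octets = describe(text)
    print(f"{label}: {code_points} code point(s), {octets} octet(s)")

# A mangled octet sequence: the last octet of a 4-octet character is lost.
mangled = "\U0001F44D".encode("utf-8")[:-1]

# A strict decoder reports a total error ...
try:
    mangled.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# ... while a lenient one hands back U+FFFD REPLACEMENT CHARACTER.
decoded = mangled.decode("utf-8", errors="replace")
print("strict decode failed:", strict_failed)
print("lenient decode contains U+FFFD:", "\ufffd" in decoded)
```

Either way, the layer that checks for an emoji sequence only ever
sees code points (or an error), never raw octets, which is the
situation the second suggested change describes.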
