Re: [Last-Call] New Version Notification for draft-crocker-inreply-react-07.txt


 



Hello Dave, others,

On 03/03/2021 23:40, Dave Crocker wrote:
I'm finally able to get some time for this.  And I'm finding myself thinking of the interaction between the IETF perspective and the Unicode perspective.  The IETF perspective uses the term octet.  I think there can be some benefit in mixing the terms, to try to connect them, for the reader.

This may be an okay 50,000-foot summary, but it is in no way appropriate for an actual protocol spec. Also, there are many IETF specs that use Unicode code points and many parts of Unicode that use the term octet. The term octet is as appropriate e.g. for MTU or HTTP Content-Length as it is for the result of encoding characters in UTF-8. The term codepoint (or code point) is as appropriate e.g. in RFC 3987 as it is somewhere in the Unicode spec.

"There may be some benefit of mixing the terms" sounds extremely vague, and the text below turns out that way. The connection between these terms has to be very precise.

Consequently, I propose:


On 3/2/2021 6:34 AM, Ricardo Signes wrote:
One:

    The rule emoji_sequence is inherited from [Emoji-Seq].  It defines a
    set of octet sequences, each of which forms a single pictograph.

I would replace "octet" with "code point".  The referenced document only describes sequences of code points.  The encoding of those into octets is orthogonal, and will be described by the content-type and content-transfer-encoding jointly.  So, I think this change is a definite improvement to accuracy, and is worth making.

NEW:

<t>The ABNF rule emoji_sequence is inherited from <xref target="Emoji-Seq"/>. It defines a set of octet sequences, each of which forms a single pictograph.

Sorry, this is wrong. The ABNF rule in target Emoji-Seq does not define a set of octet sequences. It defines a set of codepoint sequences, and whether or how they end up as octet sequences is undefined in that document.

The BNF syntax used in [Emoji-Seq] differs from <xref target="ABNF"/>, and MUST be interpreted as used in Unicode documentation. The referenced document describes these as sequences of code points.

So how do you get octets from those codepoints? You don't say. Please just use the wording that Ned provided. That wording makes things clear.
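To make the gap concrete, here is a minimal Python sketch. The sample sequence (U+2764 U+FE0F) and the variable names are my own choices for illustration, not something taken from the draft or from [Emoji-Seq]. The same code point sequence yields different octet sequences depending on the charset, which is why the spec has to say how one gets from the first to the second:

```python
# One emoji sequence, expressed as code points (illustrative choice):
# U+2764 HEAVY BLACK HEART followed by U+FE0F VARIATION SELECTOR-16.
code_points = [0x2764, 0xFE0F]

# A Python str is a sequence of code points; no octets exist yet.
text = "".join(chr(cp) for cp in code_points)

# Octets only appear once a charset is applied.
utf8_octets = text.encode("utf-8")
utf16_octets = text.encode("utf-16-be")

print(len(text))              # count of code points
print(utf8_octets.hex(" "))   # octets under UTF-8
print(utf16_octets.hex(" "))  # octets under UTF-16BE
```

Two code points become six octets under UTF-8 and four under UTF-16BE, so "a set of octet sequences" is not something the Unicode grammar alone can define.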



Two:

    Reference to unallocated code points SHOULD NOT be treated as an
    error; the corresponding octets SHOULD be processed using the system
    default method for denoting an unallocated or undisplayable code
    point.

I suggest the same change.  It's -maybe- more debatable.  But this

I find myself wanting to retain octet here.

It would be okay to leave it as is if you make the linkage explicit in the previous text by using the text provided by Ned.
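As a rough illustration of why the linkage matters here, a receiver might handle this along the following lines. This is only a sketch under my own assumptions: the helper name render_reaction is hypothetical, UTF-8 is assumed as the charset, and U+FFFD stands in for whatever the "system default method" happens to be. Note that the check for "unallocated" can only be made on code points, after the octets have been decoded:

```python
import unicodedata

REPLACEMENT = "\ufffd"  # assumed stand-in for the system default marker


def render_reaction(octets: bytes) -> str:
    """Hypothetical helper: decode UTF-8 octets, then mask unassigned code points."""
    # Octet level: malformed UTF-8 is replaced rather than raising an error.
    text = octets.decode("utf-8", errors="replace")
    # Code point level: general category "Cn" means "not assigned" in the
    # Unicode version known to this Python build.
    return "".join(
        REPLACEMENT if unicodedata.category(ch) == "Cn" else ch
        for ch in text
    )


print(render_reaction("\u2764\ufe0f".encode("utf-8")))  # assigned: passed through
print(render_reaction("\u0378".encode("utf-8")))        # unassigned: masked
```

Which code points count as unassigned depends on the Unicode version of the receiving system, which is one reason not to treat them as an error.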

Regards,   Martin.

Again, it makes a linkage between code point and octet explicit.  Further, this text involves raw data that can't be processed normally and octet has no semantics beyond saying 8-bit, whereas code point invokes substantial semantics.


Thoughts?

d/


--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call



