I'm finally able to get some time for this, and I find myself
thinking about the interaction between the IETF perspective and the
Unicode perspective. The IETF perspective uses the term octet. I think
there can be some benefit in mixing the terms, to connect them for the
reader. Consequently, I propose:
On 3/2/2021 6:34 AM, Ricardo Signes wrote:
One:
The rule emoji_sequence is inherited from [Emoji-Seq]. It defines a
set of octet sequences, each of which forms a single pictograph.
I would replace "octet" with "code point". The referenced document only
describes sequences of code points. The encoding of those into octets
is orthogonal, and will be described by the content-type and
content-transfer-encoding jointly. So, I think this change is a
definite improvement to accuracy, and is worth making.
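
To make the code point versus octet distinction concrete, here is a minimal Python sketch. The choice of the red heart emoji and of the two encodings is only an illustrative assumption, not something taken from the draft. It shows one emoji sequence, fixed as a list of code points, turning into different octet sequences depending on the encoding chosen.

```python
# One pictograph (red heart, emoji presentation): two code points.
emoji = "\u2764\uFE0F"

# The code point view is independent of any encoding.
code_points = [f"U+{ord(ch):04X}" for ch in emoji]

# The octet view depends entirely on the encoding that is applied.
utf8_octets = emoji.encode("utf-8")
utf16_octets = emoji.encode("utf-16-be")

print(code_points)          # ['U+2764', 'U+FE0F']
print(utf8_octets.hex())    # e29da4efb88f
print(utf16_octets.hex())   # 2764fe0f
```

The same two code points yield six octets in UTF-8 and four in UTF-16BE, which is why the referenced Unicode document can only speak in terms of code points.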
NEW:
<t>The ABNF rule emoji_sequence is inherited from <xref
target="Emoji-Seq"/>. It defines a set of octet sequences, each of which
forms a single pictograph. The BNF syntax used in [Emoji-Seq] differs
from <xref target="ABNF"/>, and MUST be interpreted as used in Unicode
documentation. The referenced document describes these as sequences of
code points.
Two:
Reference to unallocated code points SHOULD NOT be treated as an
error; the corresponding octets SHOULD be processed using the system
default method for denoting an unallocated or undisplayable code
point.
I suggest the same change. It's -maybe- more debatable. But this
I find myself wanting to retain octet here. Again, it makes the link
between code point and octet explicit. Further, this text involves raw
data that can't be processed normally and octet has no semantics beyond
saying 8-bit, whereas code point invokes substantial semantics.
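
For what it's worth, the "SHOULD NOT be treated as an error" behavior can be sketched roughly as below. This is only an illustration under stated assumptions: the function name `displayable` is hypothetical, U+FFFD stands in for whatever the "system default method" really is, and which code points count as unassigned depends on the Unicode version bundled with the Python runtime.

```python
import unicodedata

# Assumption: U+FFFD plays the role of the system default marker
# for an unallocated or undisplayable code point.
REPLACEMENT = "\uFFFD"


def displayable(text: str) -> str:
    """Return text with unassigned code points (category Cn) replaced,
    rather than raising an error."""
    return "".join(
        REPLACEMENT if unicodedata.category(ch) == "Cn" else ch
        for ch in text
    )


# U+0378 is unassigned in current Unicode versions; the heart passes through.
print(displayable("\u2764\uFE0F \u0378"))
```

Note that the check operates on code points after decoding; a receiver that cannot decode the octets at all is a separate case.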
Thoughts?
d/
--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call