On 1/28/2021 12:21 AM, Kjetil Torgrim Homme wrote:
On Wed, 2021-01-27 at 19:35 -0800, Dave Crocker wrote:
To the extent that your intent is to say that a) this is a subset of
UTF-8, and b) multiple bytes can be used, I think that's built into the
definition of emoji-sequence.
In fact, I had added the "one or more" text mostly to highlight that the
'sequence' can be only one byte, since 'sequence' would be expected to
be read as meaning multiple.
One small change here which will reduce the amount of confusion is to
avoid the word "byte". Indeed, it is *not* possible for the sequence
to be only one byte, since there are no Unicode code points in the
range U+0000 to U+007F with the Emoji property set.
So, use "emoji characters" or "code points" instead?
(I tend to avoid the use of "byte" in favour of "octet" to forestall
complaints from the old DEC-10, DEC-20 and Cray users anyway ☺)
Well, indeed, my entrenched use of byte probably gets in the way, here...
I want the term to be low-level and physical rather than abstract or
conceptual. That is, I'd like the term to sit outside Unicode's
specialized terminology. To that end, I think octet works well.
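
To make the octet versus code point distinction concrete, here is a minimal Python sketch. The sample characters and the helper name utf8_octets are illustrative choices, not anything taken from the draft. It only shows that a single code point above U+007F occupies more than one octet once encoded as UTF-8.

```python
# Illustrative only: one code point is not necessarily one octet in UTF-8.
# The sample characters and the helper name are assumptions for this sketch.

def utf8_octets(text: str) -> int:
    """Return the number of octets used by the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))

samples = ["A", "\u263A", "\U0001F600"]  # U+0041, U+263A, U+1F600

for ch in samples:
    # Each sample is a single code point; only the octet count differs.
    print(f"U+{ord(ch):04X}: 1 code point, {utf8_octets(ch)} octet(s)")
```

Counting octets describes the encoded form on the wire, while counting code points describes the abstract characters, which is why the choice of term changes what "one or more" means.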
Reference to unallocated code points SHOULD NOT be treated as an
error; associated bytes SHOULD be processed using the system default
method for denoting an unallocated or undisplayable code point.
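
The quoted rule could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the draft's method: the function name render_label is hypothetical, substituting U+FFFD stands in for whatever the "system default method" is on a given platform, and "unallocated" is approximated by general category Cn as reported by the Unicode database bundled with the running Python.

```python
import unicodedata

# Assumption: U+FFFD REPLACEMENT CHARACTER stands in for the platform's
# default way of denoting an unallocated or undisplayable code point.
REPLACEMENT = "\uFFFD"

def render_label(text: str) -> str:
    """Hypothetical helper: substitute unallocated code points, never raise."""
    return "".join(
        # Category "Cn" means unassigned in this Python's Unicode database.
        REPLACEMENT if unicodedata.category(ch) == "Cn" else ch
        for ch in text
    )

# U+0378 is unassigned, so it is replaced instead of being treated as an error.
print(render_label("\U0001F600\u0378"))
```

The result depends on the Unicode version of the receiving system, so a code point allocated after that version was published would also be shown as the replacement character.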
Code points that do not have the requisite attributes to qualify as
part of an emoji_sequence should also not be treated as an error,
although you probably want to allow the system to alternatively
display them normally (rather than as an unallocated or undisplayable
code point).
I think your comment addresses a different issue than the cited text is
meant for, but I also might be misunderstanding.
Probably, but I think it bears saying something about how to handle
code points without the Emoji property set. IMHO they should be
handled as undisplayable.
This steps into user interface design, more than interoperable emoji
labeling and transport.
As such, it's outside of this specification and outside of the IETF's
expertise.
d/
--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call