I'm having a little difficulty following the discussion of the text, so I
took a look at the text itself. It currently (-07) says:

   The rule emoji_sequence is inherited from [Emoji-Seq]. It permits
   one or more bytes to form a single presentation image.

   The rule base-emojis MAY be used as a simple, common list, or
   'vocabulary' of emojis. It was developed from some existing
   practice, in social networking, and is therefore intended for use.
   However support for it is not required. Having providers and
   consumers employ a common set will facilitate user interoperability,
   but different sets of users might want to have different, common
   (shared) sets.

   The emoji(s) express a recipient's summary reaction to the specific
   message referenced by the accompanying In-Reply-To header field.
   [Mail-Fmt].

   Reference to unallocated code points SHOULD NOT be treated as an
   error; associated bytes SHOULD be processed using the system default
   method for denoting an unallocated or undisplayable code point.

There are a few problems here. First, TR#51 actually uses the term "image" in
an incompatible way - specifically, it's used to refer to emojis directly
represented by actual images, not ones expressed in Unicode itself. I think the
correct term to use here is "pictograph".
Second, since I'm one of those "old DEC-10 people" and know what ILDB and IDPB
mean, I think "octet" is a better term than "byte". But I can live with "byte"
if that's the consensus.
Third, as previously noted, you always need more than one octet to encode
an emoji, and the text should reflect that.
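
As a quick illustration of the third point, here is a small Python sketch.
It assumes the emoji are carried as UTF-8, which the text quoted here does
not itself state; the sample sequences and the helper name octet_count are
chosen only for illustration.

```python
# Illustrative only: count the UTF-8 octets in a few emoji sequences.
# Assumption: the emoji are encoded as UTF-8.
samples = {
    "thumbs up": "\U0001F44D",                 # single code point
    "red heart + VS16": "\u2764\uFE0F",        # base + variation selector
    "keycap 1": "1\uFE0F\u20E3",               # ASCII digit + VS16 + keycap
}

def octet_count(seq: str) -> int:
    """Return the number of octets in the UTF-8 encoding of seq."""
    return len(seq.encode("utf-8"))

for name, seq in samples.items():
    encoded = seq.encode("utf-8")
    print(f"{name}: {len(seq)} code point(s), "
          f"{octet_count(seq)} octets, {encoded.hex(' ')}")
```

Even the keycap sequence, which begins with an ASCII digit, comes to more
than one octet once the rest of the sequence is included.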
Fourth, there's a bit of awkwardness to the second paragraph. "intended for
use" where? What does support mean?
Finally, I think a couple of word choices could be better. So how about:

   The rule emoji_sequence is inherited from [Emoji-Seq]. It
   defines a set of octet sequences, each of which forms a single pictograph.

   The rule base-emojis MAY be used as a simple, common list, or
   'vocabulary' of emojis. It was developed from some existing
   practice, in social networking, and is therefore intended for
   such use. However support for it as a base vocabulary is not required.
   Having providers and consumers employ a common set will facilitate
   user interoperability, but different sets of users might want to have
   different, common (shared) sets.

   The emoji(s) express a recipient's summary reaction to the specific
   message referenced by the accompanying In-Reply-To header field.
   [Mail-Fmt].

   Reference to unallocated code points SHOULD NOT be treated as an
   error; the corresponding octets SHOULD be processed using the system
   default method for denoting an unallocated or undisplayable code point.
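
One possible reading of that last paragraph, sketched in Python. This is an
assumption-laden illustration, not something the draft specifies: it treats
Unicode general category "Cn" (as reported by the Unicode data bundled with
the running Python) as "unallocated", and it uses U+FFFD as a stand-in for
the "system default method". The names is_unallocated and render_reaction
are hypothetical.

```python
import unicodedata

# Assumption: U+FFFD stands in for whatever marker the local system uses
# to denote an unallocated or undisplayable code point.
REPLACEMENT = "\uFFFD"

def is_unallocated(ch: str) -> bool:
    """True if ch is unassigned ("Cn") in this Python's Unicode data."""
    return unicodedata.category(ch) == "Cn"

def render_reaction(text: str) -> str:
    """Substitute the marker for unallocated code points; never raise."""
    return "".join(REPLACEMENT if is_unallocated(ch) else ch for ch in text)

# U+0378 is an unassigned code point in the Greek block.
print(render_reaction("\U0001F44D\u0378"))
```

The point is only that an unallocated code point yields a placeholder
rather than an error; a real implementation would defer to its own
rendering system.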
Ned
On 1/27/2021 7:45 PM, Barry Leiba wrote:
>
> I think the text that Dave has is correct — certainly more correct than
> the suggestion, which would imply character composition rather than the
> image composition that’s being discussed here.
>
> I don’t think a change to the text will really help, but a “(for
> example, ...)” might.
Barry, thanks.
However to the extent that there's any misunderstanding of the text by
one thoughtful, experienced reader, there's likely to be more. I've no
idea how to make it clearer or more robust, though.
d/
--
Dave Crocker
Brandenburg InternetWorking
bbiw.net