Re: [Last-Call] New Version Notification for draft-crocker-inreply-react-07.txt

John C Klensin <john-ietf@xxxxxxx> · Wed, 03 Mar 2021 16:12:43 -0500

Ned,

At this point, I think we are suffering from a difference in
perspective rather than one that is more substantive.  To start,
I am convinced that the IETF could not fix any fundamental
issues with Unicode, including addressing "the many issues with
Unicode emoji", even if we had the expertise and energy to do so
-- and it is clear to me that we have neither -- so I am not
proposing to do that, in this document or anywhere else.  At the
same time, I see a tremendous amount of ignorance about those
issues and even about Unicode fundamentals around the IETF
(perhaps the other side of the "expertise and energy" coin) and
parts of the surrounding implementer community and do not see it
as desirable for a specification published in 2021 to encourage
that ignorance.  If I believed that the likely readers of this
specification were reasonably familiar with the issues, we would
not be having this discussion: a more or less general pointer in
the direction of Unicode and UTS#51 would be fine, just as
general pointers about ASCII have been for decades. If the
terminology was not quite right, everyone would know what was
intended and we'd just move on.  

However, from my point of view, the ignorance level is high
enough that the boundary between imprecise terminology or a
general pointer in the direction of Unicode and/or UTS#51 and a
trap for the unwary is rather thin.  

Some specifics below.

--On Wednesday, 03 March, 2021 07:02 -0800 Ned Freed
<ned.freed@xxxxxxxxxxx> wrote:

>...
>> On the other hand, the IETF has usually tried to avoid getting
>> sucked into, or even wanting to hear complaints about, user
>> interface issues for reasons I vividly remember Dave
>> explaining forcefully to me and several others many years ago.
>> Because of the semantics associated with emoji and their
>> binding to graphemes -- issues that do not arise with what we
>> normally consider "text", even text in logographic scripts --
>> this specification comes very close to that territory.  I
>> still think we should move ahead with it, but believe that
>> being clear that we are interested in those issues would be
>> useful (I'm not going to try to make a case for "critical").
>> However, I don't think it requires even an extra bullet; see
>> the first suggestion below.
> 
>> Two possible suggestions:
> 
>> (1)  Change the third bullet to read:
>  
>> 	Does the presence of the Reaction capability create any
>> 	operational problems, including problems associated with
>> 	the handling of emoji characters, for message
>> 	originators or recipients?
> 
> I can already answer this one: Yes, of course it will create
> problems. The use of inline emoji in regular email subject
> lines and text is creating problems as I write this; it's
> nonsensical to think that moving those characters to a
> different part with a slightly different label will change
> this in any way. Indeed, if this proves popular it will likely
> cause more problems, not less.

Then drop the text I suggested for Section 7 and add text to
Section 2 that is a different phrasing of what you wrote above,
i.e., "We know this will cause problems in many implementations
because emoji have been problematic in other email-related
contexts for some time; implementers and those who support
products or other software that implements this extension should
be aware of those issues.".   If you could provide a few
(entirely informative) references to supplement that, so much
the better.  Or, if you prefer, the text might be "we advise
against anyone trying this experiment who is not already
familiar with Unicode and the implications of using emoji in
email or other contexts".

If we want the experiment to tell us anything at all, we should
not set up those who are not as experienced and knowledgeable as
the authors and some of those reading this to go off and try
this, call on some handy libraries whose properties they don't
understand, and then be surprised when they get complaints from
users (or, in the worst cases, regulators) about displaying
obscene gestures to small children.  That is, I hope, a lurid
and exaggerated scenario, but you know as well as I do that the
application author or provider is always the one to blame and
that "well, I just used library xyz" is rarely an effective
defense.

Or, if we know all the answers already and the only purpose of
the document is to document the extension to Content-Disposition,
then maybe those who have been suggesting that we not try to
nail down the body part content at all beyond a general
suggestion are right, e.g., change

	emoji = emoji_sequence
	emoji_sequence = { defined in [Emoji-Seq] }

to 

	emoji =
	  ; whatever you, your users, and the available libraries
	  ; think is appropriate to put in the content to reflect
	  ; a response and that the sender thinks or hopes the
	  ; receiver will be able to display appropriately.

Yes, I see more space between those two options than that above
implies.  But the point is that we either think it necessary
to define what is really allowed and have the expectation that
the definitions will be enforced and adhered to by conforming
implementations (or the libraries they choose to call) or we
don't and are really just hand-waving.  If it is the latter,
let's say that and move on.
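
A minimal sketch of the two ends of that choice, in Python, may
make the gap concrete.  The names ALLOWED_REACTIONS, accept_strict,
and accept_permissive are hypothetical, and the five code points
are an assumed stand-in for a <base-emojis>-style subset, not
something quoted from the draft.

```python
# Hypothetical stand-in for a <base-emojis>-style subset; these code
# points are illustrative assumptions, not text taken from the draft.
ALLOWED_REACTIONS = {
    "\U0001F44D",  # thumbs up
    "\U0001F44E",  # thumbs down
    "\U0001F600",  # grinning face
    "\u2639",      # frowning face
    "\U0001F622",  # crying face
}


def accept_strict(body: str) -> bool:
    """Enforce a definition: the part must be exactly one listed emoji."""
    return body.strip() in ALLOWED_REACTIONS


def accept_permissive(body: str) -> bool:
    """Hand-waving: any non-empty content is handed to the display code."""
    return bool(body.strip())
```

The strict check rejects anything outside the list, including a
listed emoji followed by a variation selector or modifier, which is
exactly the kind of detail a library may or may not handle for the
implementer.  The permissive check enforces nothing about the
content beyond its being non-empty.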

> As such, this isn't a valid way of assessing the experiment.
> The goal here isn't to address the many issues with Unicode
> emoji; it's to add a new capability to email. We believe that
> the best mechanism on which to base this new capability is
> Unicode emoji, but that's in large part due to there being no
> viable alternative.

Seems reasonable to me.  See above.

> I completely understand the desire to try and fix problems
> with Unicode in general and problems with Unicode emoji in
> particular. But after watching these attempts play out over
> and over again, in specification after specification, for the
> past 30 years (yes, it has been that long), and seen the
> results of our attempts to address these issues, I've become
> convinced that this isn't even the right venue, let alone the
> right specification, for that.

And we completely agree about that.  See above.

> I also understand that when you don't see the changes you want
> happening on the Unicode side of things, you want to warn
> everyone about the possible pitfalls. 

Not exactly.  First, while I have tried to keep it out of this
discussion because I don't think it is relevant, I believe that
Unicode has made a sufficient botch of emoji and the associated
picture language that I don't think it could be fixed, or even
usefully and significantly changed, even if they wanted to.
Unicode is, for better or worse, the only plausible option we
have today and I think the odds of a better one emerging are
slight to none.  I think we can only figure out what to do with
what is given, work around problems when necessary, and try to
avoid stepping unawares into alligator-infested swamps.  So,
yes, that leads me to warning about possible pitfalls, but it
isn't because there are changes I'd like to see Unicode make in
this area that are not happening.

> But we need to be realistic about what's possible here and
> what's not. For better or worse, anyone implementing this is
> almost certainly going to use some library or other to do all
> the Emoji handling. They'll use someone else's emoji picker to
> select them, they'll use someone else's Unicode display code to
> present them, and they will use someone else's regexp library
> to check them. 

Ok.  Fine (and we agree).  But then let's say that rather than
putting in a piece of syntax, saying "{ defined in [Emoji-Seq]
}" and pretending that is a normative specification when we know
the reality is going to be "defined by whatever library and
emoji picker the implementation chooses or has thrust upon it".

> Absent precise articulation of things that can be done to
> address specific issues we know exist given such an
> implementation strategy, any text we add is unlikely to have
> any impact. And by the same token, in assessing the
> experiment's results the only thing it's sensible to measure
> is whether or not the specification got the specifics right.

I very nearly agree.  But then we should not pretend to be
specifying something we are not really specifying.  Otherwise,
we are, as you sort of put it elsewhere, specifying an
experiment to which we already know the answer: that almost no
one is going to follow or implement

	emoji = emoji_sequence
	emoji_sequence = { defined in [Emoji-Seq] }

except by dumb luck associated with the libraries and tools they
pick and that there are going to be problems because we know
that already.  But then let's not pretend to specify things as
exactly as the above, by itself, implies that we have done.

> All that said, we do need to know if the capability itself
> causes operational problems, e.g., lots of clients can't
> properly deal with the additional message parts. But this
> needs to be distinct from emoji/Unicode issues.

No disagreement about that either.

>> (2) Add an additional bullet, probably after the current third
>> one, reading something like:
> 
>> 	Does the use of emoji characters pose any special
>> 	challenges in processing or to users, including the
>> 	issues mentioned in Section NN above?  Do
>> 	implementations prefer to support only a limited set of
>> 	emoji (on either input or output) using a list similar
>> 	to the <base-emojis> set or are users permitted to
>> 	specify, and receivers expected to be able to render,
>> 	the full range of emoji specified by UTS#51?
> 
>> I think I (slightly) prefer the first, but the second smuggles
>> in an extra issue, one that others have implicitly raised in
>> the last 48 hours: what choices implementations make among
>> those implied by the paragraph starting "The rule base-emojis
>> MAY be used..." in Section 2.
> 
> I think it's appropriate to ask for an assessment of how many
> implementations require subsetting, and if they do, what
> subsets they opted to use, and how. But beyond that we're back
> in "things we can't fix" territory.

Agreed.

best,
   john
