Ned,

At this point, I think we are suffering from a difference in
perspective rather than one that is more substantive.

To start, I am convinced that the IETF could not fix any fundamental
issues with Unicode, including addressing "the many issues with
Unicode emoji", even if we had the expertise and energy to do so --
and it is clear to me that we have neither -- so I am not proposing
to do that, in this document or anywhere else. At the same time, I
see a tremendous amount of ignorance about those issues and even
about Unicode fundamentals around the IETF (perhaps the other side of
the "expertise and energy" coin) and parts of the surrounding
implementer community and do not see it as desirable for a
specification published in 2021 to encourage that ignorance.

If I believed that the likely readers of this specification were
reasonably familiar with the issues, we would not be having this
discussion: a more or less general pointer in the direction of
Unicode and UTS#51 would be fine, just as general pointers about
ASCII have been for decades. If the terminology was not quite right,
everyone would know what was intended and we'd just move on.
However, from my point of view, the ignorance level is high enough
that the boundary between imprecise terminology or a general pointer
in the direction of Unicode and/or UTS#51 and a trap for the unwary
is rather thin.

Some specifics below.

--On Wednesday, 03 March, 2021 07:02 -0800 Ned Freed
<ned.freed@xxxxxxxxxxx> wrote:

>...
>> On the other hand, the IETF has usually tried to avoid getting
>> sucked into, or even wanting to hear complaints about, user
>> interface issues for reasons I vividly remember Dave
>> explaining forcefully to me and several others many years ago.
>> Because of the semantics associated with emoji and their
>> binding to graphemes -- issues that do not arise with what we
>> normally consider "text", even text in logographic scripts --
>> this specification comes very close to that territory.
>> I still think we should move ahead with it, but believe that
>> being clear that we are interested in those issues would be
>> useful (I'm not going to try to make a case for "critical").
>> However, I don't think it requires even an extra bullet; see
>> the first suggestion below.
>
>> Two possible suggestions:
>
>> (1) Change the third bullet to read:
>
>>    Does the presence of the Reaction capability create any
>>    operational problems, including problems associated with
>>    the handling of emoji characters, for message
>>    originators or recipients?
>
> I can already answer this one: Yes, of course it will create
> problems. The use of inline emoji in regular email subject
> lines and text is creating problems as I write this; it's
> nonsensical to think that moving those characters to a
> different part with a slightly different label will change
> this in any way. Indeed, if this proves popular it will likely
> cause more problems, not less.

Then drop the text I suggested for Section 7 and add text to Section 2
that is a different phrasing of what you wrote above, i.e., "We know
this will cause problems in many implementations because emoji have
been problematic in other email-related contexts for some time;
implementers and those who support products or other software that
implements this extension should be aware of those issues." If you
could provide a few (entirely informative) references to supplement
that, so much the better. Or, if you prefer, the text might be "we
advise against anyone trying this experiment who is not already
familiar with Unicode and the implications of using emoji in email or
other contexts".
If we want the experiment to tell us anything at all, we should not
set up those who are not as experienced and knowledgeable as the
authors and some of those reading this to go off and try this, call on
some handy libraries whose properties they don't understand, and then
be surprised when they get complaints from users (or, in the worst
cases, regulators) about displaying obscene gestures to small
children. That is, I hope, a lurid and exaggerated scenario, but you
know as well as I do that the application author or provider is always
the one to blame and that "well, I just used library xyz" is rarely an
effective defense.

Or, if we know all the answers already and the only purpose of the
document is to document the extension to Content-Disposition, then
maybe those who have been suggesting that we not try to nail down the
body part content at all beyond a general suggestion are right, e.g.,
change

   emoji = emoji_sequence
   emoji_sequence = { defined in [Emoji-Seq] }

to

   emoji = ; whatever you, your users, and the available libraries
           ; think is appropriate to put in the content to reflect
           ; a response and that the sender thinks or hopes the
           ; receiver will be able to display appropriately.

Yes, I see more space between those two options than that above
implies. But the point is that we either think it necessary to define
what is really allowed and have the expectation that the definitions
will be enforced and adhered to by conforming implementations (or the
libraries they choose to call) or we don't and are really just
hand-waving. If it is the latter, let's say that and move on.

> As such, this isn't a valid way of assessing the experiment.
> The goal here isn't to address the many issues with Unicode
> emoji; it's to add a new capability to email. We believe that
> the best mechanism on which to base this new capability is
> Unicode emoji, but that's in large part due to there being no
> viable alternative.

Seems reasonable to me. See above.

> I completely understand the desire to try and fix problems
> with Unicode in general and problems with Unicode emoji in
> particular. But after watching these attempts play out over
> and over again, in specification after specification, for the
> past 30 years (yes, it has been that long), and seen the
> results of our attempts to address these issues, I've become
> convinced that this isn't even the right venue, let alone the
> right specification, for that.

And we completely agree about that. See above.

> I also understand that when you don't see the changes you want
> happening on the Unicode side of things, you want to warn
> everyone about the possible pitfalls.

Not exactly. First, while I have tried to keep it out of this
discussion because I don't think it is relevant, I believe that
Unicode has made a sufficient botch of emoji and the associated
picture language that I don't think it could be fixed, or even
usefully and significantly changed, even if they wanted to. Unicode
is, for better or worse, the only plausible option we have today and I
think the odds of a better one emerging are slight to none. I think
we can only figure out what to do with what is given, work around
problems when necessary, and try to avoid stepping unawares into
alligator-infested swamps. So, yes, that leads me to warning about
possible pitfalls, but it isn't because there are changes I'd like to
see Unicode make in this area that are not happening.

> But we need to be realistic about what's possible here and
> what's not. For better or worse, anyone implementing this is
> almost certainly going to use some library or other to do all
> the Emoji handling. They'll use someone else's emoji picker to
> select them, they'll use someone else's Unicode display code to
> present them, and they will use someone else's regexp library
> to check them.

Ok. Fine (and we agree).
But then let's say that rather than putting in a piece of syntax,
saying "{ defined in [Emoji-Seq] }" and pretending that is a normative
specification when we know the reality is going to be "defined by
whatever library and emoji picker the implementation chooses or has
thrust upon it".

> Absent precise articulation of things that can be done to
> address specific issues we know exist given such an
> implementation strategy, any text we add is unlikely to have
> any impact. And by the same token, in assessing the
> experiment's results the only thing it's sensible to measure
> is whether or not the specification got the specifics right.

I very nearly agree. But then we should not pretend to be specifying
something we are not really specifying. Otherwise, we are, as you
sort of put it elsewhere, specifying an experiment to which we already
know the answer: that almost no one is going to follow or implement

   emoji = emoji_sequence
   emoji_sequence = { defined in [Emoji-Seq] }

except by dumb luck associated with the libraries and tools they pick
and that there are going to be problems because we know that already.
But then let's not pretend to specify things as exactly as the above,
by itself, implies that we have done.

> All that said, we do need to know if the capability itself
> causes operational problems, e.g., lots of clients can't
> properly deal with the additional message parts. But this
> needs to be distinct from emoji/Unicode issues.

No disagreement about that either.

>> (2) Add an additional bullet, probably after the current third
>> one, reading something like:
>
>>    Does the use of emoji characters pose any special
>>    challenges in processing or to users, including the
>>    issues mentioned in Section NN above?
>>    Do implementations prefer to support only a limited set of
>>    emoji (on either input or output) using a list similar
>>    to the <base-emojis> set or are users permitted to
>>    specify, and receivers expected to be able to render,
>>    the full range of emoji specified by UTS#51?
>
>> I think I (slightly) prefer the first, but the second smuggles
>> in an extra issue, one that others have implicitly raised in
>> the last 48 hours: what choices implementations make among
>> those implied by the paragraph starting "The rule base-emojis
>> MAY be used..." in Section 2.
>
> I think it's appropriate to ask for an assessment of how many
> implementations require subsetting, and if they do, what
> subsets they opted to use, and how. But beyond that we're back
> in "things we can't fix" territory.

Agreed.

best,
   john

--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call