Re: [Last-Call] New Version Notification for draft-crocker-inreply-react-07.txt

Martin J. Dürst <duerst@xxxxxxxxxxxxxxx> · Thu, 4 Mar 2021 08:34:57 +0900

Hello John, others,

Sorry this reply is late; I didn't get around to it yesterday.

On 03/03/2021 05:01, John C Klensin wrote:

--On Tuesday, 02 March, 2021 15:48 +0900 "Martin J. Dürst"
<duerst@xxxxxxxxxxxxxxx> wrote:

...
Note:
The "emoji" token looks simple. It isn't.
Implementors are well-advised not to assume that emoji
sequences are trivial to parse or validate. Among other
concerns, an implementation of the Unicode Character
Database is required. An emoji is more than a stand-in for a
simple alternation of characters. Similarly, one emoji
sequence is not interchangeable with, or equivalent to,
another one, and comparisons require detailed understanding
of the relevant Unicode mechanisms. Use of an existing
Unicode implementation will typically prove extremely
helpful, as will an understanding of the error modes that may
arise from a chosen implementation.

I think this is a valuable addition. I was following the
discussion for a long time, and wanted to point to the Unicode
implementations already existing out there and the high
probability that a mailer would use one of these, but Ned
finally pointed this out ahead of me.

Martin,

A question about the above.  When I first read the paragraph,
"implementation of the Unicode Character Database" struck me as
strange, partially because I don't know what implementing a
database means.  I would have thought the better wording would
have been closer to "implementation of the rules of UTS#51 and
the associated tables".  Do you agree?  If you (or others who
are deeply steeped in Unicode) do not see a problem with the
existing text, I'm happy to just let it go.

"implementation of the Unicode Character Database" indeed sounds 
somewhat strange. A first expansion could be "implementation of an API 
that gives access to the character properties defined in the Unicode 
Character Database", which would probably sound a bit more natural.

Such an implementation can take the form of a direct API (similar to the
old C stdlib isupper and friends), be part of a regular expression
engine, or come in some other, more targeted form (e.g. an API tailored
to IDNs). Actually implementing it involves downloading and parsing the
data Unicode provides, selecting the data needed, and converting it to a
form that can be used by the target environment (e.g. a programming
language).
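
As a rough illustration of those steps (parse the data, select what is
needed, convert it to a usable form), here is a minimal Python sketch.
It assumes the "code point(s) ; property # comment" line layout used by
Unicode data files such as emoji-data.txt. The embedded SAMPLE_DATA is a
small hand-picked stand-in, not the real file, and the names
parse_ranges and has_property are hypothetical helpers invented for this
example, not part of any existing library.

```python
import bisect

# Hand-picked sample in the UCD-style layout (assumption: a real
# implementation would read the complete file published by Unicode).
SAMPLE_DATA = """\
# sample only, not the real data file
0023          ; Emoji           # number sign
0030..0039    ; Emoji           # digit zero..digit nine
1F600..1F64F  ; Emoji           # emoticons
1F3FB..1F3FF  ; Emoji_Modifier  # skin tone modifiers
"""


def parse_ranges(text, wanted_property):
    """Return sorted (start, end) code point ranges for one property."""
    ranges = []
    for line in text.splitlines():
        # Drop trailing comments and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        # Select only the data needed: a single property.
        if len(fields) < 2 or fields[1] != wanted_property:
            continue
        # The first field is either "XXXX" or "XXXX..YYYY" in hex.
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        ranges.append((start, end))
    ranges.sort()
    return ranges


def has_property(ranges, char):
    """Binary-search the sorted ranges for the code point of char."""
    code_point = ord(char)
    index = bisect.bisect_right(ranges, (code_point, float("inf"))) - 1
    return index >= 0 and ranges[index][0] <= code_point <= ranges[index][1]


EMOJI = parse_ranges(SAMPLE_DATA, "Emoji")
print(has_property(EMOJI, "\U0001F600"))  # code point inside a listed range
print(has_property(EMOJI, "A"))           # code point outside all ranges
```

A sorted range table with binary search is only one possible target
form; a regular expression character class or a lookup table generated
at build time would serve the same purpose. Note that this covers single
code points only and says nothing about validating emoji sequences.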

Hope this helps. Please feel free to ask more if necessary, but 
preferably in another venue.

Regards,   Martin.
