Re: [Last-Call] New Version Notification for draft-crocker-inreply-react-07.txt


 



Hello John, others,

Sorry this reply is late; I didn't get around to it yesterday.

On 03/03/2021 05:01, John C Klensin wrote:


--On Tuesday, 02 March, 2021 15:48 +0900 "Martin J. Dürst"
<duerst@xxxxxxxxxxxxxxx> wrote:

...
Note:
The "emoji" token looks simple. It isn't.
Implementors are well-advised not to assume that emoji
sequences are trivial to parse or validate. Among other
concerns, an implementation of the Unicode Character
Database is required. An emoji is more than a stand-in for a
simple alternation of characters. Similarly, one emoji
sequence is not interchangeable with, or equivalent to,
another one, and comparisons require detailed understanding
of the relevant Unicode mechanisms. Use of an existing
Unicode implementation will typically prove extremely
helpful, as will an understanding of the error modes that may
arise from a chosen implementation.

I think this is a valuable addition. I was following the
discussion for a long time, and wanted to point to the Unicode
implementations already existing out there and the high
probability that a mailer would use one of these, but Ned
finally pointed this out ahead of me.

Martin,

A question about the above.  When I first read the paragraph,
"implementation of the Unicode Character Database" struck me as
strange, partially because I don't know what implementing a
database means.  I would have thought the better wording would
have been closer to "implementation of the rules of UTS#51 and
the associated tables".  Do you agree?  If you (or others who
are deeply steeped in Unicode) do not see a problem with the
existing text, I'm happy to just let it go.

"implementation of the Unicode Character Database" indeed sounds somewhat strange. A first expansion could be "implementation of an API that gives access to the character properties defined in the Unicode Character Database", which would probably sound a bit more natural.

Such an implementation can take the form of a direct API (similar to the old C stdlib isupper and friends), of a regular expression engine, or of something more targeted (e.g. an API tailored to IDNs). Actually implementing it involves downloading and parsing the data files Unicode provides, selecting the data needed, and converting it to a form that can be used in the target environment (e.g. a particular programming language).
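
As a rough illustration of those steps (parse, select, convert), here is a minimal Python sketch. It makes several assumptions: the sample lines are hand-written in the "code point(s) ; property # comment" style used by Unicode data files such as emoji-data.txt and are nowhere near complete, and the names parse_property_ranges and has_property are made up for this example rather than taken from any existing library. A real implementation would read the full files published by Unicode and would also have to handle emoji sequences, which this sketch does not attempt.

```python
import bisect

# Hand-written sample in the style of a Unicode data file (assumption:
# a real implementation reads the complete file published by Unicode).
SAMPLE = """\
# code point(s) ; property # comment
0023          ; Emoji           # number sign
002A          ; Emoji           # asterisk
0030..0039    ; Emoji           # digit zero..digit nine
00A9          ; Emoji           # copyright sign
1F600         ; Emoji           # grinning face
1F3FB..1F3FF  ; Emoji_Modifier  # skin tone modifiers
"""


def parse_property_ranges(text, wanted):
    """Parse the data, keep only the wanted property, return sorted ranges."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2 or fields[1] != wanted:
            continue  # selection step: ignore properties we do not need
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        ranges.append((start, end))
    ranges.sort()
    return ranges


def has_property(ranges, ch):
    """Binary-search the sorted (start, end) ranges for one character."""
    cp = ord(ch)
    i = bisect.bisect_right(ranges, (cp, float("inf"))) - 1
    return i >= 0 and ranges[i][0] <= cp <= ranges[i][1]


emoji = parse_property_ranges(SAMPLE, "Emoji")
modifiers = parse_property_ranges(SAMPLE, "Emoji_Modifier")

print(has_property(emoji, "7"))               # True
print(has_property(emoji, "A"))               # False
print(has_property(modifiers, "\U0001F3FD"))  # True
```

The sorted range list is just one possible target form; a trie, a bitmap, or generated source code would serve the same purpose, depending on the environment.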

Hope this helps. Please feel free to ask more if necessary, but preferably in another venue.

Regards,   Martin.

--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call



