[Last-Call] Re: [art] Re: Artart telechat review of draft-ietf-jmap-contacts-09

Tim Bray <tbray@xxxxxxxxxxxxxx> · Mon, 20 May 2024 01:37:59 +0000

On May 19, 2024 at 6:26:17 PM, Dale R. Worley <worley@xxxxxxxxxxx> wrote:

But in this case, looking at RFC 4627 sec. 2.5, "Strings", it's clear
(though not directly stated) that a JSON string representation will be a
sequence of ASCII characters that represent a sequence of Unicode
characters.  So the limitation in this draft to "Unicode characters"
matches what the definition of JSON allows, and as such there is no
subsetting.

RFC 4627 has been obsoleted; the current operative specification of JSON is RFC 8259 (disclosure: editor), from which:

      char = unescaped /
          escape (
              %x22 /          ; "    quotation mark  U+0022
              %x5C /          ; \    reverse solidus U+005C
              %x2F /          ; /    solidus         U+002F
              %x62 /          ; b    backspace       U+0008
              %x66 /          ; f    form feed       U+000C
              %x6E /          ; n    line feed       U+000A
              %x72 /          ; r    carriage return U+000D
              %x74 /          ; t    tab             U+0009
              %x75 4HEXDIG )  ; uXXXX                U+XXXX

      escape = %x5C              ; \

      quotation-mark = %x22      ; "

      unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
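
To make that last production concrete, here is a minimal Python sketch that tests a code point against the three "unescaped" ranges. The names UNESCAPED_RANGES and is_unescaped are made up for illustration; only the numeric ranges come from the ABNF above.

```python
# Ranges copied from the "unescaped" production quoted above.
# The names below are hypothetical, chosen only for this illustration.
UNESCAPED_RANGES = ((0x20, 0x21), (0x23, 0x5B), (0x5D, 0x10FFFF))


def is_unescaped(code_point: int) -> bool:
    """Return True if code_point falls in one of the 'unescaped' ranges."""
    return any(low <= code_point <= high for low, high in UNESCAPED_RANGES)


def is_surrogate(code_point: int) -> bool:
    """Return True for code points in the surrogate block U+D800..U+DFFF."""
    return 0xD800 <= code_point <= 0xDFFF
```

Since the third range runs from %x5D to %x10FFFF without a gap, is_unescaped(0xDEAD) is true even though is_surrogate(0xDEAD) is also true.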

Note the values in “unescaped”.  The range %x5D-10FFFF includes U+D800 through U+DFFF, so surrogates, including naked unpaired surrogates, are clearly allowed. Yes, that is damaging and dumb. It’s too late to change it, though, which is why I-JSON exists; see RFC 7493 (disclosure: editor), from which:

   Object member names, and string values in arrays and object members,
   MUST NOT include code points that identify Surrogates or
   Noncharacters as defined by [UNICODE].

   This applies both to characters encoded directly in UTF-8 and to
   those which are escaped; thus, "\uDEAD" is invalid because it is an
   unpaired surrogate, while "\uD800\uDEAD" would be legal.
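
As a rough illustration of that rule, the following Python sketch checks a single decoded string for surrogates and noncharacters. The function names are hypothetical, and the sketch assumes the check runs after parsing, on one string at a time; a real I-JSON validator would also have to walk every object member name and every nested value.

```python
import json


def is_noncharacter(code_point: int) -> bool:
    """Unicode noncharacters: U+FDD0..U+FDEF, plus the last two code
    points of every plane (U+nFFFE and U+nFFFF)."""
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def violates_ijson(text: str) -> bool:
    """Return True if a decoded string holds a surrogate or a noncharacter.

    Hypothetical helper: it inspects one already-decoded string only.
    """
    for ch in text:
        code_point = ord(ch)
        if 0xD800 <= code_point <= 0xDFFF or is_noncharacter(code_point):
            return True
    return False


# Python's json module accepts the lone escaped surrogate and hands back
# a str containing it, so the check has to happen after decoding.
lone = json.loads('"\\uDEAD"')
# A high surrogate followed by a low surrogate is combined into one
# supplementary code point by the decoder.
paired = json.loads('"\\uD800\\uDEAD"')
```

With these definitions, violates_ijson(lone) is true and violates_ijson(paired) is false, matching the two examples in the quoted RFC 7493 text.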
