Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

Julian Reschke <julian.reschke@xxxxxx> · Sun, 12 Mar 2017 10:14:04 +0100

On 2017-03-12 10:06, Peter Cordell wrote:
...
This exact issue just came up in a media type review, where someone
specified a charset parameter because they weren't aware of this
algorithm.

It would be very helpful to have this text in the RFC.

It does, however, need slightly more detail to take endianness into
account in the case of UTF-16 and UTF-32.
...

Does anybody recall why we removed Section 3 of RFC 4627
<https://tools.ietf.org/html/rfc4627#section-3>:

3.  Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8

?
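
For illustration, the null-pattern table quoted above can be sketched in
Python roughly as follows. This is only a sketch under stated
assumptions, not text from any RFC: the function name
detect_json_encoding is hypothetical, the input is assumed to carry no
byte order mark, and the first two characters are assumed to be ASCII,
as RFC 4627 requires. Inputs shorter than four octets are treated as
UTF-8, since under the RFC 4627 grammar the shortest UTF-16 or UTF-32
text ("[]" or "{}") already needs at least four octets.

```python
def detect_json_encoding(octets: bytes) -> str:
    """Guess the encoding of a JSON text from the nulls in its first four octets.

    Hypothetical helper following the table in RFC 4627, section 3.
    Assumes no byte order mark and that the first two characters are ASCII.
    """
    head = octets[:4]
    if len(head) < 4:
        # Assumption: a text this short cannot be UTF-16 or UTF-32 under
        # the RFC 4627 grammar, so fall back to the default encoding.
        return "utf-8"

    # True where the octet is 00, False where it is non-zero ("xx").
    nulls = tuple(b == 0 for b in head)

    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern (including xx xx xx xx) is treated as UTF-8.
    return patterns.get(nulls, "utf-8")


if __name__ == "__main__":
    sample = '{"a": 1}'
    for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
        print(codec, "->", detect_json_encoding(sample.encode(codec)))
```

The returned strings are Python codec names, so the result can be
passed straight to bytes.decode().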

Best regards, Julian