Re: [Json] JSON: remove gap between Ecma-404 and IETF draft

Joe Hildebrand <hildjj@xxxxxxxxxxx> · Wed, 13 Nov 2013 14:36:37 -0700

On 11/13/13 2:27 PM, "Paul Hoffman" <paul.hoffman@xxxxxxxx> wrote:

><no hat>
>
>On Nov 13, 2013, at 12:24 PM, Joe Hildebrand (jhildebr)
><jhildebr@xxxxxxxxx> wrote:
>
>> We would also need to change section 8.1 according to the mechanism that
>> was previously proposed:
>> 
>> 00 00 00 xx  UTF-32BE
>> 00 xx ?? xx  UTF-16BE
>> xx 00 00 00  UTF-32LE
>> xx 00 xx ??  UTF-16LE
>> xx xx ?? ??  UTF-8
>> 
>> in order to account for strings at the top level whose first character
>> has a codepoint greater than 127.
>
>A string at the top level of a JSON text still needs to start with an
>ASCII " character, so the logic is still fine, I believe.

Without top-level strings, the first *two* characters of any JSON text are
always ASCII.  With them, only the first character is guaranteed to be.
This:

"Ā"  (that's U+0022 U+0100 U+0022)

would encode the first two characters in UTF-16BE as:

00 22 01 00
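
For anyone who wants to check those octets, here is a minimal Python sketch.
The variable names are mine and purely illustrative; nothing here comes from
the draft itself.

```python
# Illustrative only: a JSON text that is a top-level string containing U+0100.
text = '"\u0100"'

# Encode as UTF-16BE (no BOM) and keep the first four octets,
# i.e. the first two characters.
first_four = text.encode("utf-16-be")[:4]

# Print them in the same "hex pairs" notation used in this thread.
print(" ".join(f"{octet:02x}" for octet in first_four))  # prints: 00 22 01 00
```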

8.1 currently says:

00 00 00 xx UTF-32BE
00 xx 00 xx UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 xx 00 UTF-16LE
xx xx xx xx UTF-8

So the JSON text above would not match any of the table entries (the
UTF-16BE row requires the third octet to be 00, and here it is 01),
causing an error.
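
To make the mismatch concrete, here is a rough Python sketch of a detector
driven by the current 8.1 table.  It assumes "00" means a zero octet and "xx"
means any non-zero octet, which is my reading of the notation; the names
TABLE_8_1 and detect are hypothetical, not taken from any implementation.

```python
# Rows of the current section 8.1 table, tried in order.
# Assumption: "00" = zero octet, "xx" = non-zero octet.
TABLE_8_1 = [
    ("00 00 00 xx", "UTF-32BE"),
    ("00 xx 00 xx", "UTF-16BE"),
    ("xx 00 00 00", "UTF-32LE"),
    ("xx 00 xx 00", "UTF-16LE"),
    ("xx xx xx xx", "UTF-8"),
]


def detect(data):
    """Return the encoding named by the first matching row, or None."""
    head = data[:4]
    if len(head) < 4:
        return None  # the table needs four octets to decide
    for pattern, name in TABLE_8_1:
        tokens = pattern.split()
        # A "00" token requires a zero octet; an "xx" token a non-zero one.
        if all((tok == "00") == (octet == 0) for tok, octet in zip(tokens, head)):
            return name
    return None


print(detect('[1]'.encode("utf-16-be")))       # prints: UTF-16BE
print(detect('"\u0100"'.encode("utf-16-be")))  # prints: None
```

The ASCII-only array is recognised, while the top-level string starting with
U+0100 falls through every row, which is the failure described above.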

-- 
Joe Hildebrand