Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

Elwyn Davies <elwynd@xxxxxxxxxxxxxx> · Sun, 12 Mar 2017 23:07:40 +0000

(with half a Gen-art hat on...)

Does the WG really want to revisit the anguished discussions that resulted in the changes to Section 8.1 of draft-ietf-json-rfc4627bis between versions 07 and 08 back in late November 2013?

See https://www.ietf.org/mail-archive/web/json/current/msg02053.html and many, many messages before this.

Cheers,
Elwyn


-------- Original message --------
From: Peter Cordell <petejson@xxxxxxxxxxxxx> 
Date: 12/03/2017  09:06  (GMT+00:00) 
To: Ned Freed <ned.freed@xxxxxxxxxxx>, Julian Reschke <julian.reschke@xxxxxx> 
Cc: draft-ietf-jsonbis-rfc7159bis.all@xxxxxxxx, John Cowan <cowan@xxxxxxxx>, ietf@xxxxxxxx, secdir@xxxxxxxx, json@xxxxxxxx, Benjamin Kaduk <kaduk@xxxxxxx> 
Subject: Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03 

On 11/03/2017 15:41, Ned Freed wrote:
>> On 2017-03-11 03:08, John Cowan wrote:
>> >
>> > On Thu, Mar 9, 2017 at 12:53 AM, Benjamin Kaduk <kaduk@xxxxxxx
>> > <mailto:kaduk@xxxxxxx>> wrote:
>> >
>> >     If that's what's supposed to happen, it should probably be more
>> >     clear, yes.  (But aren't there texts that have valid
>> >     interpretations in multiple encodings?)
>> >
>> >
>> > Not if the content is well-formed JSON and the only possible encodings
>> > are UTF-8, UTF-16, and UTF-32.  It suffices to examine the first four
>> > bytes of the input.  If there are no NUL bytes in the first four bytes,
>> > it is UTF-8; if there are two NUL bytes, it is UTF-16; if there are
>> > three NUL bytes, it is UTF-32.  This works because the grammar requires
>> > the first character to be in the ASCII repertoire, and the NUL
>> > *character* (U+0000) is not allowed at all.
>
>> Good explanation. Maybe the spec should include it.
>
> +1
>
> This exact issue just came up in a media type review, where someone
> specified a charset parameter because they weren't aware of this algorithm.
>
> It would be very helpful to have this text in the RFC.

It does need slightly more detail, though, to take endianness into
account in the UTF-16 and UTF-32 cases.
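
As a rough illustration of what the extra detail might look like, here is
a minimal Python sketch of the NUL-counting check John describes, extended
to use the position of the NUL bytes to pick the byte order. The function
name guess_json_encoding is hypothetical, and the sketch makes several
assumptions: the input has no byte order mark, it is at least four bytes
long when encoded as UTF-16 or UTF-32, and for UTF-16 the first two
characters are both ASCII (true when the text starts an object or array,
but not for every top-level string). It is not proposed spec text.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the NULs in its first four bytes.

    Assumes no BOM and that the leading characters are ASCII, so each
    one contributes exactly one non-NUL byte to the pattern.
    """
    head = data[:4]
    nuls = head.count(0)

    # No NUL bytes at all: UTF-8 (U+0000 may not appear literally in JSON).
    if nuls == 0:
        return "utf-8"

    if len(head) == 4:
        if nuls == 3:
            if head[0] != 0:   # xx 00 00 00
                return "utf-32-le"
            if head[3] != 0:   # 00 00 00 xx
                return "utf-32-be"
        elif nuls == 2:
            if head[0] == 0 and head[2] == 0:   # 00 xx 00 xx
                return "utf-16-be"
            if head[1] == 0 and head[3] == 0:   # xx 00 xx 00
                return "utf-16-le"

    # Anything else does not match the patterns this sketch knows about.
    raise ValueError("cannot determine encoding from the first four bytes")
```

A caller would then decode with something like
data.decode(guess_json_encoding(data)) before handing the result to a
JSON parser.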

The XML spec may offer some example text:

https://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing

Pete Cordell
Codalogic Ltd
Read & write XML in C++, http://www.xml2cpp.com