Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

John Cowan <cowan@xxxxxxxx> · Fri, 10 Mar 2017 21:08:35 -0500

On Thu, Mar 9, 2017 at 12:53 AM, Benjamin Kaduk <kaduk@xxxxxxx> wrote:

> If that's what's supposed to happen, it should probably be more
> clear, yes.  (But aren't there texts that have valid interpretations
> in multiple encodings?)

Not if the content is well-formed JSON and the only possible encodings are UTF-8, UTF-16, and UTF-32.  It suffices to examine the first four bytes of the input.  If there are no NUL bytes in the first four bytes, it is UTF-8; if there are two NUL bytes, it is UTF-16; if there are three NUL bytes, it is UTF-32.  This works because the grammar requires the first character to be in the ASCII repertoire, and the NUL *character* (U+0000) is not allowed at all.
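
As an illustration only, the counting rule described above might be sketched in Python roughly as follows. The function name detect_json_encoding is hypothetical and is not taken from the draft or from any library. The sketch assumes the input has no byte order mark and is at least four bytes long, and it reports any input that does not match one of the three stated counts as undetermined (None) rather than guessing.

```python
from typing import Optional


def detect_json_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding family of a JSON text from its first four bytes.

    Hypothetical helper: applies the NUL-counting rule from the message above.
    Assumes no byte order mark is present.
    """
    if len(data) < 4:
        # The rule as stated examines four bytes; shorter input is left undetermined.
        return None

    # Count the zero-valued bytes among the first four bytes.
    nul_count = data[:4].count(0)

    # 0 NULs -> UTF-8, 2 NULs -> UTF-16, 3 NULs -> UTF-32; anything else -> None.
    return {0: "utf-8", 2: "utf-16", 3: "utf-32"}.get(nul_count)
```

For example, detect_json_encoding('{"a":1}'.encode("utf-16-le")) sees the bytes 7B 00 22 00, counts two NUL bytes, and returns "utf-16". The explicit "-le" and "-be" codecs are used here because Python's plain "utf-16" and "utf-32" codecs prepend a byte order mark when encoding, which this sketch does not handle.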

-- 
John Cowan          http://vrici.lojban.org/~cowan        cowan@xxxxxxxx
I don't know half of you half as well as I should like, and I like less
than half of you half as well as you deserve.  --Bilbo