RE: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09

"Black, David" <david.black@xxxxxxx> · Wed, 10 Dec 2014 02:52:34 +0000

This looks good, and the explanation of what's going on (numbers,
'true', 'false' and 'null' lack delimiters) is a useful addition.

One small nit:

	'<RS>truefales<RS>' is not two top-level values, 'true',
	and 'false'; it is simply not a valid JSON text.

truefales -> truefalse

Thanks,
--David

> On Tue, Dec 09, 2014 at 06:49:35PM +0000, Black, David wrote:
> > > So I think we really do need to say something about top-level numbers
> > > (and true, false, and null), namely: that they must be delimited by
> > > whitespace, that '<RS>1234<RS>' is not a valid sequence element because
> > > the number may have been truncated.  (Ditto '<RS>true<RS>', since the
> > > intended text could have been 'trueish', which is invalid of course, but
> > > still.)
> >
> > That would be more robust, as then all JSON texts in a sequence have
> > delimiters and absence of the closing delimiter clearly indicates
> > truncation.
> 
> OK.
> 
> New section 2.4 text:
> 
>    While objects, arrays, and strings are self-delimited in JSON texts,
>    numbers, and the values 'true', 'false', and 'null' are not.  Only
>    whitespace can delimit the latter four kinds of values.
> 
>    Parsers MUST check that any JSON texts that are a top-level number,
>    or which might be 'true', 'false', or 'null' include JSON whitespace
>    (at least one byte matching the "ws" ABNF rule from RFC7159) after
>    that value, otherwise the JSON-text may have been truncated.  Note
>    that the LF following each JSON text matches the "ws" ABNF rule.
> 
>    Parsers MUST drop JSON-text sequence elements consisting of
>    non-self-delimited top-level values that may have been truncated
>    (that are not delimited by whitespace).  Parsers can report such
>    texts as warnings (including, optionally, the parsed text and/or the
>    original octet string).
> 
>    For example, '<RS>123<RS>' might have been intended to carry the
>    top-level number 123.4, but must have been truncated.  Similarly,
>    '<RS>true<RS>' might have been intended to carry the invalid text
>    'trueish'.  '<RS>truefales<RS>' is not two top-level values, 'true',
>    and 'false'; it is simply not a valid JSON text.
> 
> This is the only place where the ws rule comes up, so merely saying "at
> least one byte matching" it should suffice.
> 
> I'm also adding this following the above, based on your comment about
> incremental parsers:
> 
>    Implementations may produce a value when parsing '<RS>"foo"<RS>'
>    because their JSON text parser might be able to consume bytes
>    incrementally, and since the JSON text in this case is a
>    self-delimiting top-level value, the parser can produce the result
>    without consuming an additional byte.  Such implementations should
>    skip to the next RS byte, possibly reporting any intervening
>    non-whitespace bytes.
> 
> (yes, I think this should be a 'should', not a 'SHOULD').
> 
> Nico
> --
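
For illustration only, the proposed section 2.4 rules quoted above (and the
note on incremental parsers) could be sketched roughly as follows. This is
not taken from the draft or from any existing implementation. The names
parse_sequence, RS, and JSON_WS are hypothetical, the input is assumed to be
an already-decoded string rather than raw octets, and Python's
json.JSONDecoder.raw_decode stands in for an incremental JSON text parser
because it returns the offset at which the parsed value ended. Reporting
trailing non-whitespace as a warning while still producing the value is one
possible reading of the 'should skip to the next RS' text, not the only one.

```python
import json

RS = "\x1e"          # ASCII Record Separator, written <RS> in the draft text
JSON_WS = " \t\n\r"  # the four characters matched by the RFC 7159 "ws" rule


def _reject_constant(name):
    # Python's json module accepts NaN/Infinity by default; RFC 7159 does not.
    raise ValueError("not valid JSON: " + name)


_DECODER = json.JSONDecoder(parse_constant=_reject_constant)


def parse_sequence(data):
    """Return (values, warnings) for a JSON text sequence held in a str."""
    values, warnings = [], []
    # Anything before the first RS is not a sequence element, so skip it.
    for element in data.split(RS)[1:]:
        text = element.lstrip(JSON_WS)
        if not text:
            continue  # empty element, e.g. two RS characters in a row
        try:
            # raw_decode parses one value and reports where it stopped.
            value, end = _DECODER.raw_decode(text)
        except ValueError:
            warnings.append(("invalid", element))
            continue
        rest = text[end:]
        # Objects, arrays, and strings carry their own closing delimiter.
        self_delimited = isinstance(value, (dict, list, str))
        if not self_delimited:
            if not rest:
                # Number/true/false/null with no "ws" after it: drop it.
                warnings.append(("possibly truncated", element))
                continue
            if rest[0] not in JSON_WS:
                # e.g. 'truefales': not a valid JSON text at all.
                warnings.append(("invalid", element))
                continue
        values.append(value)
        if rest.strip(JSON_WS):
            # Value was produced; report the bytes skipped before the next RS.
            warnings.append(("trailing bytes", element))
    return values, warnings
```

With this sketch, '<RS>123<RS>' and '<RS>true<RS>' are dropped as possibly
truncated, '<RS>truefales<RS>' is dropped as invalid, '<RS>123.4<LF>' yields
123.4, and '<RS>"foo"<RS>' yields the string even without a trailing LF.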