RE: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09

"Black, David" <david.black@xxxxxxx> · Tue, 9 Dec 2014 18:49:35 +0000

> So I think we really do need to say something about top-level numbers
> (and true, false, and null), namely: that they must be delimited by
> whitespace, that '<RS>1234<RS>' is not a valid sequence element because
> the number may have been truncated.  (Ditto '<RS>true<RS>', since the
> intended text could have been 'trueish', which is invalid of course, but
> still.)

That would be more robust, as then all JSON texts in a sequence have
delimiters and absence of the closing delimiter clearly indicates
truncation.
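
To make the rule concrete, here is a rough Python sketch of how a sequence reader might apply it. This is only an illustration under stated assumptions, not text from the I-D: the names parse_sequence, RS and JSON_WS are hypothetical, the input is assumed to be an already-decoded string, and the check follows the proposal quoted above (a top-level number, true, false, or null that is not followed by whitespace is reported as possibly truncated rather than accepted).

```python
import json

RS = "\x1e"          # ASCII record separator, written <RS> in this thread
JSON_WS = " \t\n\r"  # the whitespace characters JSON allows around a text


def parse_sequence(data):
    """Return a list of (status, value) pairs, one per sequence element.

    status is "ok", "truncated?" or "invalid". Hypothetical helper for
    illustration only.
    """
    results = []
    # Anything before the first RS is not an element, so drop it.
    for chunk in data.split(RS)[1:]:
        if chunk.strip(JSON_WS) == "":
            continue  # consecutive RS characters: nothing to parse
        try:
            value = json.loads(chunk)
        except ValueError:
            # Whenever the JSON text parse fails, report it and move on
            # to the next element.
            results.append(("invalid", None))
            continue
        # Strings, arrays and objects are self-delimiting.
        self_delimiting = isinstance(value, (str, list, dict))
        if not self_delimiting and chunk[-1] not in JSON_WS:
            # '<RS>1234<RS>' or '<RS>true<RS>': the text may have been cut short.
            results.append(("truncated?", value))
        else:
            results.append(("ok", value))
    return results
```

With this sketch, '<RS>1234<LF>' yields ("ok", 1234), while '<RS>1234' followed directly by the next <RS> yields ("truncated?", 1234), leaving the caller to decide whether to discard or log the element.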

Thanks,
--David

> -----Original Message-----
> From: Nico Williams [mailto:nico@xxxxxxxxxxxxxxxx]
> Sent: Tuesday, December 09, 2014 12:17 PM
> To: Black, David
> Cc: General Area Review Team (gen-art@xxxxxxxx); ops-dir@xxxxxxxx;
> ietf@xxxxxxxx; json@xxxxxxxx
> Subject: Re: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09
> 
> On Tue, Dec 09, 2014 at 04:41:12PM +0000, Black, David wrote:
> > [A] JSON text parse failures
> > > [...]
> >
> > Your alternative wording "whenever the JSON text parse fails, ..." is fine.
> 
> OK.
> 
> > [D] Truncation
> >
> > > A missing terminating LF is not a problem for strings, arrays, or
> > > objects.  I seem to recall that we did discuss this.  We could require
> > > that such texts fail to parse, but perhaps the more important thing is
> > > to require common parser behavior as to such truncations.
> > >
> > > Your ABNF proposal is certainly more strict than the one in the I-D.  I'm
> > > neutral as to whether this form or the one in the I-D (with the ws issue
> > > fixed) is better.  The stricter form is clearly easier to talk about,
> > > therefore preferable, but it will mean discarding texts where only that
> > > terminating LF is truncated.
> >
> > I concur with both of the above paragraphs - my preference is to detect
> > incomplete JSON texts at the sequence level via the missing LF rather than
> > special-casing numbers and relying on failed JSON parses for everything
> > else.
> > In general, earlier detection of errors increases the options for dealing
> > with them.
> 
> And, of course, a streaming/incremental parser might well output all
> there is to output when only the last LF is missing but the top-level
> value was properly delimited anyways.  So it's kinda difficult to get a
> fool-proof requirement that the trailing LF must be present.
> 
> Your review comments included adding this note about incremental
> parsing.  There's a conflict here between the two comments that had not
> been apparent to me last night.  I now think that fixing the ws problem
> is the best way forward.
> 
> > Once the incomplete text is detected, a JSON parse could be attempted,
> > with the JSON parser knowing that the text is incomplete (e.g., text
> > may fail to parse, a number at the end of the text must not be produced
> > as an incremental parse result).
> 
> That's so for non-incremental parsers.  (Or when buffering the complete
> text instead of handling incrementally, even though one has an
> incremental parser.)
> 
> Consider one implementation I'm familiar with.  Its JSON text parser is
> incremental (but not streaming), so it produces outputs with no need for
> extra whitespace when the input text is a string, array, or object, but
> for top-level numbers, booleans, and null, it needs to either read one
> more byte or reach EOF before it will output them.
> 
> So I think we really do need to say something about top-level numbers
> (and true, false, and null), namely: that they must be delimited by
> whitespace, that '<RS>1234<RS>' is not a valid sequence element because
> the number may have been truncated.  (Ditto '<RS>true<RS>', since the
> intended text could have been 'trueish', which is invalid of course, but
> still.)
> 
> > As for RFC 20 ...
> >
> > > Is this resolved by now?  I can always reference only Unicode.
> >
> > Keep the RFC 20 reference - I have no problem with it.  Moreover, as a
> > result of all the hubbub around this nit, the IESG has issued a Last Call
> > to reclassify RFC 20 as an Internet Standard ... so that this never
> > arises again ...
> 
> Yes, I noticed.  I expect the IETF LC will pass for that.
> 
> Thanks,
> 
> Nico
> --