Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-11

"Black, David" <david.black@xxxxxxx> · Thu, 11 Dec 2014 23:11:16 +0000

And the -11 version resolves everything else, plus picks up improved
versions of the editorial clarifications; -11 is Ready for RFC publication.

Many thanks for the timely responses.

Thanks,
--David

> -----Original Message-----
> From: Black, David
> Sent: Wednesday, December 10, 2014 10:51 AM
> To: nico@xxxxxxxxxxxxxxxx; General Area Review Team (gen-art@xxxxxxxx); ops-
> dir@xxxxxxxx
> Cc: ietf@xxxxxxxx; json@xxxxxxxx; Black, David
> Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10
> 
> The -10 version of this draft resolves items [A]-[E] from the
> Gen-ART and OPS-Dir review of -09, and the IESG is in the process of
> resolving the (silly) idnits complaint about RFC 20 being a possible
> downref.
> 
> For item [D], a different approach was taken instead of modifying
> the ABNF - the resulting new Section 2.4 is a definite improvement
> to the draft, and is significantly clearer than the modified ABNF
> would have been.  Nicely done.
> 
> Item [F] about the <angle-bracketed> text in the IANA Considerations
> (Section 4) remains open - if the intent is to not deal with replacing
> that text until after IESG approval, an RFC Editor Note to that effect
> should be added to Section 4.
> 
> I have an additional editorial concern - given all the discussion about
> UTF-8, it would be good for the draft to make it clear early on
> that JSON text sequences are UTF-8 only.  Here are some suggested changes.
> 
> Abstract:
> 
>    This document describes the JSON text sequence format and associated
>    media type, "application/json-seq".  A JSON text sequence consists of
>    any number of JSON texts, each prefix by an Record Separator
>    (U+001E), and each ending with a newline character (U+000A).
> 
> "any number of JSON texts" -> "any number of UTF-8 encoded JSON texts"
> 
> It also looks like ASCII names for RS and LF are being mixed w/Unicode
> codepoints in the second sentence in the abstract.  I'm not sure that's
> a good thing to do, especially as the body of the draft refers to RS and
> LF as being ASCII.  Here are a couple of changes that would remedy this:
> 
>    "an Record Separator (U+001E)" -> "an ASCII Record Separator (0x1E)"
>    "a newline character (U+000A)" -> "an ASCII newline character (0x0A)"
> 
> Section 2 JSON Text Sequence Format:
> 
> I suggest adding this sentence as a separate paragraph at the end of this
> section (i.e., just before Section 2.1):
> 
>    JSON text sequences MUST use UTF-8 encoding; other encodings of JSON
>    (i.e., UTF-16 and UTF-32) MUST NOT be used.
> 
> Aside from item [F], all of the above are editorial suggestions, but I
> think making this clear early in the draft will help avoid potential
> implementer confusion.
> 
> Thanks,
> --David
> 
> > -----Original Message-----
> > From: Black, David
> > Sent: Friday, December 05, 2014 9:51 AM
> > To: nico@xxxxxxxxxxxxxxxx; General Area Review Team (gen-art@xxxxxxxx); ops-
> > dir@xxxxxxxx
> > Cc: ietf@xxxxxxxx; json@xxxxxxxx; Black, David
> > Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09
> >
> > This is a combined Gen-ART and OPS-Dir review.  Boilerplate for both follows
> > ...
> >
> > I am the assigned Gen-ART reviewer for this draft. For background on
> > Gen-ART, please see the FAQ at:
> >
> > <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
> >
> > Please resolve these comments along with any other Last Call comments
> > you may receive.
> >
> > I have reviewed this document as part of the Operational directorate's
> ongoing
> > effort to review all IETF documents being processed by the IESG.  These
> > comments
> > were written primarily for the benefit of the operational area directors.
> > Document editors and WG chairs should treat these comments just like any
> other
> > last call comments.
> >
> > Document: draft-ietf-json-text-sequence-09
> > Reviewer: David Black
> > Review Date: Dec 5, 2014
> > IETF LC End Date: Dec 5, 2014
> > IESG Telechat date: Dec 18, 2014
> >
> > Summary: This draft is on the right track, but has open issues
> >  		described in the review.
> >
> > This draft specifies a format that packs multiple JSON texts into a
> > single string.  The ASCII RS (0x1E) character is used to separate texts,
> > and a linefeed is appended to each text to ensure that a complete text
> > always ends with a whitespace character.
> >
> > All of the open issues are minor - the most important ones center on
> > treatment of incomplete JSON texts - that appears to be an afterthought
> > in this draft and needs more attention.  I also found a couple of
> > minor issues in the Security and IANA Considerations sections, both of
> > which are almost nits.
> >
> > Major issues: None.
> >
> > Minor issues:
> >
> > [A] Section 2.1:
> >
> >    If parsing of such an octet string as a JSON text fails, the parser
> >    SHOULD nonetheless continue parsing the remainder of the sequence.
> >
> > That's not quite right - there are two levels of parsing, JSON
> > sequence parsing and JSON text parsing of each text in the sequence,
> > both of which might be implemented in a single-pass parser.  For such an
> > implementation, the above sentence could be (mis-)read to imply that the
> > JSON text parse should resume from the point at which it failed, which
> > would be silly (although I've seen heroic PL/1 parsers do exactly that).
> > Instead, the parse needs to skip ahead to the next RS, ignoring the rest
> > of the JSON text that failed to parse.  I suggest:
> >
> >    If parsing of such an octet string as a JSON text fails, and the
> >    octet string is followed by an RS octet, the parser
> >    SHOULD nonetheless skip ahead to that RS octet and continue parsing
> >    the remainder of the sequence from there.
> >
> > That also covers the case where there is nothing more to parse after the
> > JSON text that caused the parse failure.
> >
> > [B] Section 2.3:
> >
> > Is incremental parsing of a JSON text within a sequence allowed, or
> > is the parser required to not produce any results until the parse of
> > the entire text is successful?  I'd expect that incremental parsing
> > is ok (so results may be produced from a text that ultimately fails
> > to parse), and I think that's worth stating.
> >
> > [C] Section 2.4:
> >
> >    Parsers MUST check that any JSON texts that are a top-level number
> >    include JSON whitespace ("ws" ABNF rule from [RFC7159]) after the
> >    number, otherwise the JSON-text may have been truncated.
> >
> > That reference to the "ws" rule doesn't get the job done because that
> > rule allows a match to no characters - it's of the form ws = *( ... )
> > where ... is the list of whitespace characters.  What's needed here is
> > a rule of the form vws = 1*( ...) to force there to be at least one
> > whitespace character, but see the next issue for a better way to deal
> > with this topic by pulling the appended LF into the sequence parse
> > instead of the text parse.
> >
> > [D] I wonder whether the possibility of incomplete texts ought to be
> > encoded into the parsing rules to directly catch JSON texts that must
> > be incomplete because the last character is not LF, e.g.:
> >
> >      JSON-sequence = *(1*RS (possible-JSON / truncated-JSON / empty-JSON))
> >      RS = %x1E; "record separator" (RS), see RFC20
> >      possible-JSON = 1*(not-RS) LF ; attempt to parse as UTF-8-encoded
> >                                ; JSON text (see RFC7159)
> >      truncated-JSON = *(not-RS) not-LFRS); truncated, don't attempt
> > 					; to parse as JSON text
> >      empty-JSON = LF ; only the LF appended by the encoder, nothing to parse
> >
> >      not-RS = %x00-1D / %x1F-FF; any octet other than RS
> >      not-LFRS = %x00-09/ %x1B-1D / %x1F-FF; any octet other than RS or LF
> >
> > Note that this won't detect all incomplete JSON texts, because LF is allowed
> > within a JSON text (and this should be stated).
> >
> > [E] Section 3 - Security Considerations
> >
> > Incomplete and malformed JSON texts can be used to attack JSON parsers -
> > that should be pointed out, as I don't see that in RFC 7159's security
> > considerations and incomplete texts are a relevant consideration for
> > this draft.
> >
> > [F] Section 4 - IANA Considerations
> >
> >    Security considerations: See <this document, once published>,
> >    Section 3.
> >
> >    Interoperability considerations: Described herein.
> >
> >    Published specification: <this document, once published>.
> >
> >    Applications that use this media type: <by publication time
> >    <https://stedolan.github.io/jq> is likely to support this format>.
> >
> > Replace all three instances of the angle bracketed text.  The first two
> > instances should be RFC references (e.g., RFC XXXX) w/a note to the RFC
> > Editor to insert the number of the RFC when published.  The third instance
> > should be resolved now, or could have an RFC Editor note added indicating
> > that the author will resolve that during Authors 48 hours.
> >
> > Nits/editorial comments:
> >
> > idnits didn't like the reference to RFC 20 for ASCII:
> >
> >   ** Downref: Normative reference to an Unknown state RFC: RFC   20
> >
> > RFC 5234 (ABNF) uses this, which looks like a better reference:
> >
> >    [US-ASCII]  American National Standards Institute, "Coded Character
> >                Set -- 7-bit American Standard Code for Information
> >                Interchange", ANSI X3.4, 1986.
> >
> > --- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---
> >
> > Most of these questions are n/a because this draft describes a format
> > that will be used in other protocols to which RFC 5706's concerns would
> apply.
> >
> > A.1.4   Have the Requirements on other protocols and functional
> >        components been discussed?
> >
> > The specification of the interaction of the JSON sequence parser with the
> > JSON text parser is not as clear as it should be for incomplete or malformed
> > JSON texts.  See Minor Issues [A]-[E] above.
> >
> > A.1.8   Are there fault or threshold conditions that should be reported?
> >
> > Yes, incomplete JSON texts - this is covered in sections 2.3 and 2.4.
> >
> > Thanks,
> > --David
> > ----------------------------------------------------
> > David L. Black, Distinguished Engineer
> > EMC Corporation, 176 South St., Hopkinton, MA  01748
> > +1 (508) 293-7953             FAX: +1 (508) 293-7786
> > david.black@xxxxxxx        Mobile: +1 (978) 394-7754
> > ----------------------------------------------------