Yes, -11 is fine wrt my review:
http://www.ietf.org/mail-archive/web/gen-art/current/msg11048.html

Thanks,
--David

> -----Original Message-----
> From: Jari Arkko [mailto:jari.arkko@xxxxxxxxx]
> Sent: Thursday, December 18, 2014 8:45 AM
> To: Patrik Fältström; Black, David
> Cc: John Cowan; ops-dir@xxxxxxxx; ietf@xxxxxxxx; Paul Hoffman; Manger, James;
> General Area Review Team (gen-art@xxxxxxxx)
> Subject: Re: [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-
> sequence-10
>
> David - thank you for the review!
>
> My understanding of this thread and the -11 is that we are done with respect
> to the modifications coming out of your review. Let me know otherwise.
>
> Thanks, all.
>
> Jari
>
> On 13 Dec 2014, at 02:02, Patrik Fältström <paf@xxxxxxxxxx> wrote:
> >
> >> On 12 dec 2014, at 02:12, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:
> >>
> >> Manger, James scripsit:
> >>
> >>> How about:
> >>>
> >>> "A JSON text sequence consists of any number of JSON texts,
> >>> each prefixed by a Record Separator (U+001E) character, and
> >>> each suffixed by an End of Line (U+000A) character. It is
> >>> UTF-8 encoded."
> >>>
> >>> Say "Information Separator Two (U+001E)" if you really want to be pure.
> >>
> >> The trouble with that is that U+001E has no official Unicode name or
> >> function; those come from ISO 6429, which is incorporated (in relevant
> >> part) into US-ASCII, which is described in RFC 20.
> >
> > Although it does not have a Unicode Name, the alias is as close as we can
> > get, which is "INFORMATION SEPARATOR TWO":
> >
> > # grep ^001E UnicodeData.txt
> > 001E;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR TWO;;;;
> > #
> >
> > So I suggest using that.
> >
> > It is, I think, wrong to say "Record Separator" and then still reference
> > the Unicode tables.
> >
> > Alternatively, one could just write (and make it clearer how this works;
> > this is my understanding):
> >
> >> A JSON text sequence consists of any number of JSON texts, each prefixed
> >> by the U+001E character and each suffixed by U+000A. The JSON texts, as
> >> well as the whole JSON text sequence, are encoded in UTF-8, although any
> >> JSON text might be truncated and therefore not a valid UTF-8 sequence.
> >> Any occurrence of the UTF-8 encoding of U+001E (the byte 0x1E) is to be
> >> viewed as the first byte before each JSON text, and any occurrence of the
> >> byte 0x0A is to be viewed as the first byte after a complete JSON text.
> >> If the JSON text is truncated, the 0x0A byte will not be present.
> >
> > I.e. the grammar is sort of (before coffee in the morning):
> >
> >   sequence := 0x1E text
> >
> >   text := complete-text | truncated-text
> >
> >   complete-text := proper-UTF8 0x0A
> >
> >   truncated-text := proper-UTF8 broken-UTF8
> >
> >   proper-UTF8 := "" | "a sequence of bytes, possible to parse as a series
> >                  of UTF8-encoded Unicode characters"
> >
> >   broken-UTF8 := "a sequence of bytes not possible to parse as a UTF8-
> >                  encoded Unicode character"
> >
> >   Patrik
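
[Editorial illustration, not part of the thread or the draft: a minimal
Python sketch of a parser that follows the informal grammar quoted above
(elements prefixed by 0x1E, complete elements suffixed by 0x0A, UTF-8
payloads, with truncated elements tolerated). The function name, constants,
and sample input are hypothetical.]

import json

RS = b"\x1e"  # INFORMATION SEPARATOR TWO (U+001E), element prefix
LF = b"\x0a"  # LINE FEED (U+000A), suffix of a complete element

def parse_json_text_sequence(data: bytes):
    """Yield (ok, value_or_raw_bytes) for each element of the sequence.

    Elements are the byte runs between RS markers. A complete element
    ends with LF; a truncated element may lack the LF and may not even
    be valid UTF-8, so it is reported as raw bytes rather than parsed.
    """
    if not data:
        return
    # Anything before the first RS is not part of any element and is dropped.
    for chunk in data.split(RS)[1:]:
        if chunk.endswith(LF):
            try:
                yield True, json.loads(chunk[:-1].decode("utf-8"))
            except (UnicodeDecodeError, ValueError):
                yield False, chunk
        else:
            # No trailing LF: treat as truncated, do not attempt to parse.
            yield False, chunk

if __name__ == "__main__":
    stream = b'\x1e{"a": 1}\n\x1e[1, 2, 3]\n\x1e{"trunc'
    for ok, value in parse_json_text_sequence(stream):
        print("parsed:" if ok else "skipped (incomplete/invalid):", value)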