Re: [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09

> On 7 dec 2014, at 19:05, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:
> 
> Patrik Fältström scripsit:
> 
>> But it also reference RFC7159, which doesn't require UTF-8 but instead
>> for some weird reason also allow other encodings of Unicode text. And
>> on top of that it says Byte Order Mark is not allowed.
> 
> 7159 was meant to tighten the wording of 4627, not to impose additional
> constraints on it.  For that, see the I-JSON draft.

The problem I have is that 7159 is not tight enough: it allows encodings other than UTF-8, which in turn breaks the framing, because this draft takes for granted that each separator character is exactly one byte.

I.e. the way I read draft-ietf-json-text-sequence (and I might be wrong), you have specific octet values that act as separators. That only works if the encoding is UTF-8.

See Figure 1:

> possible-JSON = 1*(not-RS); attempt to parse as UTF-8-encoded
>                                ; JSON text (see RFC7159)

Now, if this is NOT UTF-8, then this might be a pretty bad situation.
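
To illustrate the concern, here is a minimal Python sketch. It is not taken from the draft: split_records is a hypothetical helper that splits an octet stream on the RS octet (0x1E), and the sample text containing U+1E00 is an assumed example. With UTF-8 the stream yields one record, but with UTF-16 (big-endian here) the octet 0x1E shows up both inside the encoded separator and inside the encoding of U+1E00, so the stream falls apart into fragments.

```python
import json

RS = b"\x1e"  # the record separator octet the draft frames records with


def split_records(stream):
    """Hypothetical helper: split an octet stream on RS, dropping empty chunks."""
    return [chunk for chunk in stream.split(RS) if chunk]


# Assumed sample text: a JSON object whose value is U+1E00.
text = '{"name": "\u1e00"}'

# UTF-8: RS is the single octet 0x1E, and 0x1E never occurs inside
# the encoding of any other character.
utf8_stream = RS + text.encode("utf-8") + b"\n"
utf8_records = split_records(utf8_stream)

# UTF-16BE: RS becomes the two octets 00 1E, and U+1E00 encodes as 1E 00,
# so an octet-level split on 0x1E cuts the text in the wrong places.
utf16_stream = ("\x1e" + text + "\n").encode("utf-16-be")
utf16_records = split_records(utf16_stream)

print(len(utf8_records), json.loads(utf8_records[0]))
print(len(utf16_records), utf16_records)
```

In the UTF-16 case none of the fragments is a complete JSON text, which is the kind of failure I am worried about.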

What I am saying is that I would like this draft to say explicitly that the only profile of RFC7159 that can be used is the one where UTF-8 is in use, i.e. somewhere have something like "The encoding MUST be UTF-8, although RFC7159 also allows other encodings, like UTF-16." Then, in the security considerations section, something like "RFC7159 allows not only UTF-8 encoding but also, for example, UTF-16, which MIGHT create problems for a parser, depending on what data is serialized."

I.e. I want this draft to be even tighter than RFC7159.

Let me ask it this way: is there any reason to allow encodings other than UTF-8? If so, how do you handle the encoding of the separators?

>> This together implies that first of all this draft might not lead to
>> stable implementations, secondly one can not store in JSON strings
>> that include the Byte Order Mark, and there are other unspecified
>> situations.
> 
> If by that you mean that a JSON string may not contain U+FEFF, that is
> incorrect, for U+FEFF is recognized as a BOM only when placed at the
> beginning of an entity body, whereas an entity body in JSON format can
> begin only with [ or { classically, or by extension with [0-9"tfn].

Ok, so what you are saying is that a string in an attribute value in the JSON blob can still start with U+FEFF?

If so, good, and my apologies for not understanding this when I read the text.
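
A quick check with Python's standard json module is consistent with that reading. This is only a sketch of one parser's behaviour, not something RFC7159 mandates: a string value that starts with U+FEFF parses fine, while the same code point in front of the whole JSON text is rejected as a BOM. The sample strings are assumptions for illustration.

```python
import json

# U+FEFF as the first character of a string value: accepted and preserved.
value = json.loads('{"attr": "\ufeffabc"}')["attr"]
starts_with_feff = value.startswith("\ufeff")

# The escaped form \uFEFF inside the JSON text decodes to the same string.
escaped_value = json.loads('{"attr": "\\ufeffabc"}')["attr"]

# U+FEFF in front of the JSON text itself: Python's json module treats it
# as a byte order mark and refuses to parse.
try:
    json.loads('\ufeff{"attr": "abc"}')
    leading_bom_rejected = False
except json.JSONDecodeError:
    leading_bom_rejected = True

print(starts_with_feff, value == escaped_value, leading_bom_rejected)
```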

   Patrik

