On Mon, Dec 08, 2014 at 11:20:25PM -0500, John Cowan wrote:
> Patrik Fältström scripsit:
>
> > This implies the whole thing is a UTF-8 encoded text that is to be
> > parsed like this:
>
> No, this is a misunderstanding.  There is no requirement that the sequence
> *as a whole* is well-formed UTF-8 text.  For example, if the first JSON
> text is written only in part for whatever reason (system crash, etc.) to
> a log file, the next process can write a 0x1E byte and carry on.

Correct.  But we're splitting hairs: a parser can attempt to parse any
octet string found between 0x1E bytes (it cannot itself contain 0x1E,
since that byte is the separator), but it should accept only those that
are valid JSON texts encoded in UTF-8.  The encoder, of course, must
produce UTF-8.

Nico
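
To make the parser/encoder asymmetry concrete, here is a rough sketch in Python.  It is only an illustration, not text from the draft: the names encode_sequence and parse_sequence are made up, the newline after each text is an assumption based on my reading of the sequence format, and Python's json module is a little more lenient than strict JSON (it accepts NaN, for instance).

```python
import json

RS = b"\x1e"  # separator byte written before each JSON text


def encode_sequence(values):
    """Serialize values as RS-prefixed UTF-8 JSON texts.

    The trailing newline after each text is an assumption of this sketch.
    """
    out = bytearray()
    for value in values:
        out += RS + json.dumps(value, ensure_ascii=False).encode("utf-8") + b"\n"
    return bytes(out)


def parse_sequence(data):
    """Split on RS and try every chunk; return (accepted, rejected).

    accepted holds the parsed values, rejected holds the raw chunks that
    were not valid UTF-8 or not a complete JSON text.
    """
    accepted, rejected = [], []
    for chunk in data.split(RS):
        if not chunk.strip():
            continue  # nothing between separators, e.g. before the first RS
        try:
            accepted.append(json.loads(chunk.decode("utf-8")))
        except ValueError:
            # UnicodeDecodeError and json.JSONDecodeError are both ValueErrors
            rejected.append(chunk)
    return accepted, rejected


# A writer crashed partway through its first text; the next writer simply
# starts with RS and carries on.
log = b'\x1e{"a": 1' + encode_sequence([{"b": 2}])
accepted, rejected = parse_sequence(log)
```

Here the truncated first text ends up in rejected and the second text is accepted, so the sequence as a whole never has to be well-formed.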