Hi, Ben.

A note on the Internationalization points:

> I'm also concerned about the freewheeling use of Unicode.  While
> this document does discuss the potential encodings and lists UTF-8
> as the default (and most interoperable), I think it would benefit
> from a stricter warning that parties using JSON for communication
> must have some out-of-band way to agree on what encoding is to be
> used.  I would expect that this is usually going to be done by the
> protocol using JSON, but could see a place for the actual
> communicating peers to have out-of-band knowledge.  (An application
> having to guess what encoding is being used based on heuristics is a
> recipe for disaster.)
>
> Additionally, the document makes no mention of Unicode
> normalization, which can be a minefield.  The precis working group
> has a lot of work in this area, from which the executive summary is:
> it's a lot of work to do things correctly, and being sloppy usually
> leads to vulnerabilities.  The most obvious issue would be in (the
> comparison of) field names using strings that can be represented
> differently in different normalization forms (for example, e with
> acute accent), which can be either U+00e9 or U+0065 and the
> combining character U+0301.  Simply converting to Unicode code
> points is insufficient for an implementation to cause those strings
> to compare as equivalent.  I think this document should at least
> mention that Unicode normalization forms exist and should be
> considered by protocol designers when using JSON with characters
> outside of US-ASCII.

I believe that all of this is the realm of the protocol *using* JSON,
and doesn't belong in the JSON spec itself.  The JSON spec makes it
clear what the encoding options are, and leaves things such as the set
of allowed characters (and any restrictions on them), the normalization
and canonicalization, and the comparison rules to the next level... and
I believe that's how it should be.
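
To make the field-name concern concrete, here is a minimal Python sketch
of what a protocol layered on JSON might do.  The helper name
same_after_nfc, the sample member names, and the choice of NFC are
assumptions made for illustration only; nothing in the JSON spec
mandates any of them.

```python
import json
import unicodedata

# Two spellings of "e with acute accent":
composed = "\u00e9"      # single precomposed code point U+00E9
decomposed = "e\u0301"   # U+0065 followed by combining acute U+0301


def same_after_nfc(a: str, b: str) -> bool:
    """Hypothetical protocol-level comparison: normalize both strings to NFC first."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Code-point comparison treats the two spellings as different strings.
raw_equal = composed == decomposed

# After NFC normalization they compare as equal.
normalized_equal = same_after_nfc(composed, decomposed)

# A JSON parser keeps both member names, since their code points differ.
doc = json.loads('{"caf\\u00e9": 1, "cafe\\u0301": 2}')

# A protocol that chose NFC for name comparison would see a single name.
normalized_keys = {unicodedata.normalize("NFC", key) for key in doc}
```

Whether the two member names in doc count as duplicates is exactly the
kind of decision that depends on the comparison rule the using protocol
picks.
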
Different uses of JSON will have different needs in these regards, and
*those* specifications are the right places to say that.

Barry