Hi Barry,

On Tue, Mar 07, 2017 at 09:52:37PM -0500, Barry Leiba wrote:
> Hi, Ben.
> A note on the Internationalization points:
>
> > I'm also concerned about the freewheeling use of Unicode. While
> > this document does discuss the potential encodings and lists UTF-8
> > as the default (and most interoperable), I think it would benefit
> > from a stricter warning that parties using JSON for communication
> > must have some out-of-band way to agree on what encoding is to be
> > used. I would expect that this is usually going to be done by the
> > protocol using JSON, but could see a place for the actual
> > communicating peers to have out-of-band knowledge. (An application
> > having to guess what encoding is being used based on heuristics is a
> > recipe for disaster.)
> >
> > Additionally, the document makes no mention of Unicode
> > normalization, which can be a minefield. The precis working group
> > has a lot of work in this area, from which the executive summary is:
> > it's a lot of work to do things correctly, and being sloppy usually
> > leads to vulnerabilities. The most obvious issue would be in (the
> > comparison of) field names using strings that can be represented
> > differently in different normalization forms (for example, e with
> > acute accent), which can be either U+00e9 or U+0065 and the
> > combining character U+0301. Simply converting to Unicode code
> > points is insufficient for an implementation to cause those strings
> > to compare as equivalent. I think this document should at least
> > mention that Unicode normalization forms exist and should be
> > considered by protocol designers when using JSON with characters
> > outside of US-ASCII.
>
> I believe that all of this is the realm of the protocol *using* JSON,
> and doesn't belong in the JSON spec itself. The JSON spec makes it
> clear what the encoding options are, and leaves things such as the set
> of allowed characters (and any restrictions on them), the
> normalization and canonicalization, and the comparison rules to the
> next level... and I believe that's how it should be. Different uses
> of JSON will have different needs in these regards, and *those*
> specifications are the right places to say that.

I agree that it is appropriate for the JSON spec to merely list out the
options and leave decisions to the consuming applications/protocols.
However, it seems irresponsible to not mention that those designing
such protocols should be aware of the potential issues.

-Ben
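
To make the normalization concern quoted above concrete, here is a minimal Python sketch. It is only an illustration, not something the JSON spec or this thread prescribes: the helper name same_key and the choice of NFC are assumptions made for the example, and a real protocol would have to pick its own normalization form and comparison rules.

```python
import json
import unicodedata

# Two spellings of "e with acute accent":
precomposed = "\u00e9"        # single code point U+00E9
decomposed = "\u0065\u0301"   # "e" (U+0065) + combining acute accent U+0301


def same_key(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two field names after normalizing both.

    The default form (NFC) is an assumption for this sketch.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Comparing raw code points treats the two spellings as different strings.
print(precomposed == decomposed)           # False
print(same_key(precomposed, decomposed))   # True

# A JSON parser that compares member names code point by code point
# therefore sees two distinct fields here.
doc = json.loads('{"\\u00e9": 1, "e\\u0301": 2}')
print(len(doc))                            # 2

# Normalizing the names collapses them into one key, so one value is lost.
normalized = {unicodedata.normalize("NFC", k): v for k, v in doc.items()}
print(len(normalized))                     # 1
```

Whether the two names should be treated as the same field, and what to do when they collide, is exactly the kind of decision that falls to the protocol using JSON.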