On Mon, Dec 02, 2013 at 04:30:09PM -0500, Phillip Hallam-Baker wrote:
> Since we are talking about a serialization format, the distinction between
> unordered sets and lists cannot occur at the wire level and this is where
> we need interoperation.

But it can be part of the on-the-wire description.  See below.

> One of the things I think we have learned from JSON is that a
> self-describing format only needs to specify the abstract type of the datum
> and not the representation.

Self-description is a continuum.  Some ASN.1 encoding rules can encode
quite a bit of a schema on the wire -- clearly there's a point at which
the resulting redundancy causes problems.  But it's also true that
having a large subset of the schema in the serialization can be useful
(e.g., for generic "dump" tools).

Given the prevalence of languages like Python, a "set" type will no
doubt seem useful to some!  Heck, the ability to use non-string values
as keys (names) for objects would be nice too -- anyone who's spent much
time with Python and JSON has wished for these things.  JSON alone is
insufficiently expressive for "pickling" Python values; JSON with a fair
bit of convention layered on gets closer to being good enough for
pickling Python values.

Context will affect how much of the schema you or I will find desirable
to see appear redundantly on the wire.

But mostly I agree with you: "datetime" and such are interpretations of
more basic datatypes, and so they belong in a pre-agreed/documented
schema rather than on the wire.  Indeed, a datetime/timestamp could be
either a string or a number, and still be understood correctly in
context.

> There are many data encodings on offer but I would like to be able to write
> one decoder that can consume a data stream that contains basic JSON data
> and JSON with extensions. This makes negotiating an encoding in a Web
> service easy, the consumer states which encodings are acceptable and the
> sender makes sure what is sent is compatible, downgrading the encoding to
> the level accepted if necessary.

Yes.  Though, really, the main thing that's missing is chunked
(indefinite-length) unescaped binary data.  That's the thing that's
difficult/expensive to deal with in JSON (or XML, for that matter).
Bulk data transfers do matter.

(And if one is going to add that to any serialization, then plain
binary-coded integers and IEEE754 doubles may be much better, perf-wise,
than decimal encodings.)

Nico
--
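
To make the "convention layered on" point concrete, here is a minimal Python sketch of one way sets and non-string keys might be carried in plain JSON. The `__set__` and `__map__` tags and the helper names are hypothetical, invented for illustration; nothing in JSON or in this thread standardizes them, and both peers would have to agree on them out of band.

```python
import json

# Hypothetical tags, chosen for this sketch only; both ends must agree on them.
SET_TAG = "__set__"
MAP_TAG = "__map__"


def to_jsonable(value):
    """Rewrite sets and non-string-keyed dicts as tagged JSON objects."""
    if isinstance(value, (set, frozenset)):
        items = [to_jsonable(v) for v in value]
        # Sort by JSON text so the output is deterministic.
        return {SET_TAG: sorted(items, key=lambda v: json.dumps(v, sort_keys=True))}
    if isinstance(value, dict):
        plain = all(isinstance(k, str) for k in value)
        if plain and SET_TAG not in value and MAP_TAG not in value:
            return {k: to_jsonable(v) for k, v in value.items()}
        # Non-string keys (or a clash with our tags): emit key/value pairs.
        return {MAP_TAG: [[to_jsonable(k), to_jsonable(v)] for k, v in value.items()]}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


def _hashable(value):
    """Turn decoded lists back into tuples so they can be keys or set members."""
    if isinstance(value, list):
        return tuple(_hashable(v) for v in value)
    return value


def from_jsonable(value):
    """Inverse of to_jsonable. Tuples outside keys/sets come back as lists,
    and sets or dicts nested inside a set or used as a key are not supported."""
    if isinstance(value, list):
        return [from_jsonable(v) for v in value]
    if isinstance(value, dict):
        if set(value) == {SET_TAG}:
            return {_hashable(from_jsonable(v)) for v in value[SET_TAG]}
        if set(value) == {MAP_TAG}:
            return {_hashable(from_jsonable(k)): from_jsonable(v)
                    for k, v in value[MAP_TAG]}
        return {k: from_jsonable(v) for k, v in value.items()}
    return value


value = {"tags": {"a", "b"}, "grid": {(0, 1): "x", 2: "y"}}
wire = json.dumps(to_jsonable(value))
back = from_jsonable(json.loads(wire))
```

Without such a layer, Python's `json.dumps` raises `TypeError` for a set or a tuple key, and silently turns the integer key `2` into the string `"2"`. The cost of the convention is exactly the redundancy discussed above: a generic JSON consumer that does not know the tags sees only ordinary objects and arrays.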