>> 2) No support for tag compression.

(I assume this was about map keys, not about tags.)

> That's an interesting requirement, and one that I think could be added to
> the design if there were others that felt motivated to help. I think I
> can see a way that it could be added later: create a new tag that precedes
> a map of string-to-int conversions.

I'd probably do it the other way around:

    tagN([{1: "foo", 2: "bar"}, ...abbreviated data item...])

Where an abbreviated data item of the form

    [1, 2, 3, {1: "beer", 2: "wine", "baz": 1}, 5, 6]

would then be interpreted as

    [1, 2, 3, {"foo": "beer", "bar": "wine", "baz": 1}, 5, 6]

Yes, processing of this kind is easy to add as a tag.
If the first parameter is instead a URI (preferably ni: scheme), it
could save carrying around a large dictionary.

> However, my intuition is that this wouldn't have radically better behavior
> than gzip, and so I'd like to see some numbers to prove that the
> complexity was worthwhile.

I share that intuition.
CBOR is intended to be useful also in those environments where running
a full compression algorithm is impractical; here such a scheme could
still have benefits.

>> The first one is my main complaint. I want to be able to use the binary
>> and text JSON encodings interchangeably and not have the upper layers to
>> have to bother with it at all.

(The applications I have in mind use media types, but:)

> I think I understand this. I could see where my CBOR event-based parser
> could also take JSON in, and generate the exact same events. I might even
> do that as a proof of concept. Could you say more about what in CBOR you
> think violates this?

Well, if you don't have a media type, and don't know whether you'll
get a JSON text or a CBOR data item, you may need to mechanically
distinguish them.

E.g., the following six characters can occur at the start of a JSON text.
All are valid as start (or only) byte of a CBOR data item:

    Byte   JSON meaning               CBOR interpretation

    %x20 ; Space                      -1
    %x09 ; Horizontal tab             9
    %x0A ; Line feed or New line      10
    %x0D ; Carriage return            13
    %x5B ; [ left square bracket      starts byte string
    %x7B ; { left curly bracket       starts UTF-8 string

(Well, for any valid JSON texts, heuristics might tell you the string
data items a CBOR parser sees are unrealistically large.)

If a CBOR application does require initial signature bytes for
self-description purposes, I would suggest using something like

    0xd8 0xf8 ...data item...

which decodes as tag248(data item); we could define 248 as a no-op tag.

(I'm still working on your other message -- lots of juicy input, thank you!)

Grüße, Carsten
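
For illustration, the two mechanisms discussed in this message can be sketched in Python. This is a minimal sketch under stated assumptions, not something defined by CBOR: the names expand_keys, describe_initial_byte and small_int_value are hypothetical, the abbreviation scheme is assumed to apply to map keys only (recursively), and a key that is missing from the dictionary is assumed to be left unchanged. The second part just splits an initial byte into its major type (top three bits) and additional information (low five bits), which is how the six bytes in the table get their CBOR reading.

```python
# Hypothetical helper names; none of them come from a CBOR library or spec.

MAJOR_TYPES = [
    "unsigned integer", "negative integer", "byte string", "UTF-8 string",
    "array", "map", "tag", "simple/float",
]


def expand_keys(item, table):
    """Replace abbreviated map keys using table; recurse into arrays and maps.

    Assumption: only map keys are abbreviated, and keys that are not in
    the table (such as "baz") are kept as they are.
    """
    if isinstance(item, list):
        return [expand_keys(x, table) for x in item]
    if isinstance(item, dict):
        return {table.get(k, k): expand_keys(v, table) for k, v in item.items()}
    return item


def describe_initial_byte(b):
    """Split an initial byte into (major type name, additional information)."""
    return MAJOR_TYPES[b >> 5], b & 0x1F


def small_int_value(b):
    """Value of an initial byte that is, on its own, a complete integer item.

    Only defined for major types 0 and 1 with additional information < 24.
    """
    major, info = b >> 5, b & 0x1F
    if major > 1 or info >= 24:
        raise ValueError("not a one-byte integer data item")
    return info if major == 0 else -1 - info


if __name__ == "__main__":
    table = {1: "foo", 2: "bar"}
    data = [1, 2, 3, {1: "beer", 2: "wine", "baz": 1}, 5, 6]
    print(expand_keys(data, table))

    # The six bytes that can start a JSON text, read as CBOR initial bytes.
    for b in b" \t\n\r[{":
        print(hex(b), describe_initial_byte(b))

    # First byte of the suggested signature 0xd8 0xf8.
    print(hex(0xD8), describe_initial_byte(0xD8))
```

For 0x5B and 0x7B the additional information comes out as 27, meaning an 8-byte length follows; that is why a CBOR parser fed a JSON text would see an implausibly large string. For 0xd8 it comes out as 24, meaning the tag number is in the next byte (0xf8 = 248).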