Dear authors,
Sorry I'm submitting these comments after the end of the LC period. I
hope they can still be of use.
- The document is well written and very clearly explained.
- I am still of the opinion that this document would be better
published as an Experimental RFC. Unlike TCP and UDP. But the comments
below are unrelated to this discussion.
- The "diagnostic notation" can be used very effectively for things like
configuration files, e.g. if an application already uses CBOR on the
wire. Therefore I would suggest formalizing it a bit more, so that we
also have interoperability at this level.
- And since this notation is not meant as a JSON extension, this is a
good time to introduce comments (e.g. with an initial '#') into the
notation.
- The positive vs. negative encoding means that the parser actually
deals with 9-, 17-, 33- and 65-bit integers: an n-bit argument covers
0..2^n-1 when positive and -2^n..-1 when negative. I don't think this
makes it easier to write parsers.
- Arrays are prefixed by the number of elements but not by their length
in bytes. And elements need not be all of the same size. So you cannot
skip the array without fully parsing every last element. IIRC this is a
major disadvantage compared to ASN.1 encodings.
- A puzzling change from JSON, and one that probably complicates
implementations quite a bit, is that a map's index can be of any type,
not just a string. And this includes mixed index types for the same map.
- And similarly to arrays, you cannot skip a map element without deep
parsing of the element.
- I think many of the tag values are too specific, and are best left to
applications. For example, why should the format care if the app encodes
a UTF-8 string in base64? OTOH, I would reserve a part of the tag space
for "private" application-specific allocations.
- One tag value you may want to consider adding is "critical" in the
security sense of the word, i.e., an application is required to fail if
it does not understand the value (probably best applied to map keys).
- In the "diagnostic notation", I suggest using symbolic values rather
than integers for tags, e.g. TAG_URI.
- Sec. 3: because of the need for deep parsing mentioned above, a wire
protocol cannot be extended by adding an element that uses a new data
type (e.g. double precision FP) unless all potential recipients
understand the type, even though they might not need to use the data
element.
- Type restrictions for tags should be spelled out more clearly. E.g. in
2.4.4.2, please clarify that when this tag applies to an array or map,
*all* the items (and potentially items of nested arrays/maps?) MUST be
byte strings. IMHO this just adds complexity and it's best to only tag
the atomic item.
- Text such as this (for unknown simple types): "might issue a warning,
might stop processing altogether, might handle the error by making the
unknown value available to the application as such, or take some other
type of action." is a security disaster waiting to happen. Also, it does
not allow extensibility. Even though the encoding format is nominally
extensible, in reality you cannot add stuff because the behavior of
existing implementations when faced with it is unpredictable.
- Similarly for unknown tags (which IMHO should be ignored). Note that
"unknown" includes currently specified tags, because implementations are
not required to implement all current tags.
- Another security issue, for incomplete arrays: "a parser may
completely fail the decoding, or substitute the missing data and data
items using an decoder-specific convention." This is a buffer overflow
vulnerability by a different name.
- And by the way, the entire Sec. 3 is non-normative. I suggest using
normative language for parser behavior, to ensure it is deterministic.
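
To make the integer-range and skipping comments above more concrete,
here is a rough Python sketch. It assumes the initial-byte layout as I
read it in the document (3-bit major type, 5-bit additional
information, with values 24 to 27 selecting a 1-, 2-, 4- or 8-byte
argument) and handles only definite-length items. The function names
read_head, decode_int and skip_item are my own, not taken from the
document or from any library.

```python
def read_head(buf, pos):
    """Parse one item head; return (major_type, argument, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated input")
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        # Small argument stored directly in the initial byte.
        return major, info, pos
    if info <= 27:
        # 1, 2, 4 or 8 argument bytes follow, big-endian.
        size = 1 << (info - 24)
        if pos + size > len(buf):
            raise ValueError("truncated input")
        return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size
    # Assumption: anything else is out of scope for this sketch.
    raise ValueError("additional information %d not handled here" % info)


def decode_int(buf, pos=0):
    """Decode major type 0 or 1; return (value, next_pos).

    With an 8-byte argument the result spans -2**64 .. 2**64 - 1,
    which does not fit a signed or unsigned 64-bit machine integer.
    """
    major, arg, pos = read_head(buf, pos)
    if major == 0:
        return arg, pos
    if major == 1:
        return -1 - arg, pos
    raise ValueError("not an integer item")


def skip_item(buf, pos=0):
    """Return the offset just past the item at pos.

    Arrays and maps carry an element count, not a byte length, so the
    only way to find their end is to walk every nested item head.
    """
    major, arg, pos = read_head(buf, pos)
    if major in (0, 1):
        return pos                      # integers: head only
    if major in (2, 3):
        end = pos + arg                 # strings: argument is a byte length
        if end > len(buf):
            raise ValueError("truncated input")
        return end
    if major == 4:
        for _ in range(arg):            # array: arg elements
            pos = skip_item(buf, pos)
        return pos
    if major == 5:
        for _ in range(2 * arg):        # map: arg key/value pairs
            pos = skip_item(buf, pos)
        return pos
    if major == 6:
        return skip_item(buf, pos)      # tag: exactly one tagged item follows
    return pos                          # major 7: payload consumed with the head
```

For example, skip_item(bytes.fromhex("830182020362616204")) returns 8
only after visiting all five nested item heads of [1, [2, 3], "ab"],
whereas a length-prefixed encoding could jump over the array in one
step. A receiver that hits a head it cannot interpret inside the array
has no way to resynchronize.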
Thanks,
Yaron