Reviewer: Robert Sparks Review result: Not Ready I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments. For more information, please see the FAQ at <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>. Document: draft-ietf-cellar-ebml-09 Reviewer: Robert Sparks Review Date: 2019-02-20 IETF LC End Date: None IESG Telechat date: Not scheduled for a telechat Summary: Not ready for publication as a Proposed Standard Note that this is really an early review - IETF LC has not yet been issued for this draft. Issues: It's not clear that this group is chartered to produce a general purpose binary equivalent to XML. Instead, it appears to be chartered to document FFV1 and Matroska. EBML as it is currently used for those things needs to be documented, but rather than try to make it into a format that other things besides the work of this group appears out of scope. If I'm correct, then this document shouldn't need to create an IANA registry - it need only document what the group needs (and if the group needs more later, it can update this document). The abstract and introduction would need to be adjusted to scope the purpose of the format to supporting the work of this group. My review assumes a scope of "documenting these existing formats" rather than providing a general purpose markup language. If I'm wrong and this group is chartered to produce an alternative for other protocols to use, this needs review from people who are more expert in that kind of representational design than me. The document is very hard to absorb given its current structure. It requires multi-pass reading to figure out. While I like the idea of leaping directly into security considerations, the draft really needs an introduction that lays out the high-level structural concepts more linearly. A diagram of the inherent tree structure would help reinforce the definitions. Showing the defined types (13.1.5.9) in that diagram would be useful. A shorter introduction to VINT (not the details of its structure, but why you have such a construct in the first place) would also help. Also consider examples showing the end of Unknown-Sized elements is determined (The description in the first partial paragraph at the top of page 12 is complicated). It would also help to introduce the idea of null terminated strings here to smooth the forward reference from section 8.4 to section 9. CRC-32 elements aren't defined until towards the end of the document, but non-trivial things are done with them throughout. Having those introduced, at least conceptually, much earlier would be good. The third paragraph of the Security Considerations section (including the bullets) appears to say its ok to treat the following invalid things as valid. If that's the case, why are they invalid in the first place? I think for some of them, perhaps text has drifted and they are no longer invalid, but just to be avoided when it is reasonable to do so? For those that really are invalid, some text should be added describing _why_ they are ok to accept. In 5.4 you claim the Size column in the first table refers to the size of the "VINT_DATA" in bits. The values there are not how many bits of VINT_DATA you have. If that's what you really want to show, the values should be 7, 14, etc. rather that 2^7, 2^14. If you mean for the column to show how large an integer can be represented by a VINT with this octet length, the values need to be 2^7-1, 2^14-1, etc. It would be helpful to reduce the width of the first columns so that the last value in the Representation column does not wrap. The last sentence of the 2nd paragraph of section 7 does not parse. Please simplify it. There is odd notation at the end of page 12. What does 2^(2_7) mean? At section 8, the second sentence scans very badly. Consider breaking it into two or more sentences so that there is no ambiguity around the concept being defined. As written, its too easy to misunderstand that the element type defines a concept that describes definition. At section 11, you say that EBML Documents MUST NOT contain any data that is not part of an EBML Element, but I think other sections allow that to happen in data rewrite situations? Why is the unique and persistent requirement SHOULD and not MUST at the top of page 20? At the definition of name (13.1.5.1), please talk about why this particular set of characters were chosen. Please expand on "the risk of false positives" in 13.1.5.3. The risk here is not obvious from the current text. At 13.1.5.6.1 (a section nesting 5 deep is a document structure warning sign by itself), in the first bullet I suggest you say "are of the type of the element" instead of "are integers or floats". What you have now lets me put an integer on one side and a float on the other. The range representations in 13.1.5.6.1 don't allow representing 0<=x<1, 0<x<1, or any other range with one or more open ends. Why isn't that problematic? The element descriptions in section 13 have range values that don't conform to the range represnations in 13.1.5.6.1. See 13.2.5 for example (where the range is "not 0"). The last sentence of section 13.1.12 looks like a security problem. Why would you ever use _any_ elements containing a CRC-32 element that's bad? Some discussion near the top of the document about what these CRC-32 elements are supposed to accomplish seems necessary. More needs to be said about the similar security issues at section 14. At 13.2.9, what does "read upper values" mean? Nits: The definition of Master Element is not a definition - it describes one propery Master Elements have. AFAICT, the concept isn't concisely defined anywhere in the document. The definition of Void Element doesn't need to say "damaged data". It coule be perfectly good data that you are overwriting. I suggest removing the word "official" from the Element Name definition. The concept of an Identically Recurring Element is not clear throughout the document. Some section should give a complete definition, and the places that call out whether exceptions (such as where the security considerations discusses Identically Recurring Elements with invalid CRC-32 elements) are still Identically Recurring Elements or not. What happens if the values in the invalid CRC-32 elements are different from each other? At section 8.7, second paragraph, first sentence. The sentence is long, and "equals to" looks like a problem spot for non-native-English readers. Perhaps "set to the same value as"? Micronits: Page 9: s/four octet./four octets./ Expanding the numbers in section 8.1 and 8.2 isn't really helpful. It would be better to just say 2^64-1.