Derrick Stolee <stolee@xxxxxxxxx> writes: >>> + chunk_id = get_be32(table_of_contents); >>> + if (chunk_id) { >>> + error(_("final chunk has non-zero id %"PRIx32""), chunk_id); >>> + return -1; >>> + } >> >> Shouldn't we be validating the size component associated with this >> "id=0" fake chunk that appears at the end as well? No, please disregard this comment, which was based on my incorrect understanding of the "size" field associated with this fake ID==0 chunk (I incorrectly thought the size had something to do with the file header plus TOC, but it is not---it is to allow discovering the size of the last chunk by being a sentinel that records the offset of an extra chunk at the end that does not actually exist). > I like this, but why not just use pair_chunk_fn inside of > the implementation of pair_chunk() so callers have an easy > interface. Yes, I didn't realize that earlier design iteration resulted in the introduction of the "pair_chunk()" after discovering that it often is necessary to just note the address where the data begins, so I am OK to leave something like pair_chunk() as a public interface, and implementing the pair_chunk() helper like you suggest would be a perfectly fine way to do so. It however is curious that the callers who use pair_chunk() do not get the same quality of data as read_chunk() callers. The users of pair_chunk() presumably are not ready to (or simply do not want to) process the data immediately by using read_chunk() with callback, but when they get ready to process the data, unlike read_chunk callbacks, they do not get to learn how much they ought to process---all they learn is the address of the beginning of the chunk. I do not see a way to write pair_chunk() users safely to guarantee that they do not overrun at the tail end of the chunk they are processing. Thanks.