As part of my -Wunused-parameter series, I noticed that a few callbacks used with the chunk-format API ignored the "chunk_size" parameter. I had initially just annotated these with UNUSED under the usual "well, not all callbacks need all parameters" logic. But if you think about it, a chunk callback that does not look at the chunk size _must_ be buggy, as there is no way it could ensure that it does not read past the end of the chunk. In a well-formed file this isn't a problem, since most chunks have expected sizes (e.g., a commit-graph has a fixed-size commit-data record for every commit mentioned in its index chunk). But it is very easy to get out-of-bounds reads for files that are not well-formed. I think the security implications here are pretty minor. The only files which use the chunk format are multi-pack-index and commit-graph, neither of which we'd expect to receive over the network (so you'd need to access an untrusted tarball, etc, to even see a malicious file). And we'd never try to write to this memory (they're read-only mmaps of the files). So your worst case is probably an unexpected out-of-bounds read and segfault, on a file that is hard to put in front of the victim. But I think it's still worth fixing. The extra checks aren't very expensive, and let us handle bugs or corruption more gracefully, as well. It's a lot of patches because there are a lot of chunk types. ;) But each one is hopefully pretty straightforward in isolation. I tried to group similar chunks together (e.g., commit-graph and midx both have OIDF and OIDL chunks), but otherwise just fixed the midx chunks in the order they appear in the code, followed by the same for commit-graph. [01/20]: chunk-format: note that pair_chunk() is unsafe [02/20]: t: add library for munging chunk-format files [03/20]: midx: stop ignoring malformed oid fanout chunk [04/20]: commit-graph: check size of oid fanout chunk [05/20]: midx: check size of oid lookup chunk [06/20]: commit-graph: check consistency of fanout table [07/20]: midx: check size of pack names chunk [08/20]: midx: enforce chunk alignment on reading [09/20]: midx: check size of object offset chunk [10/20]: midx: bounds-check large offset chunk [11/20]: midx: check size of revindex chunk [12/20]: commit-graph: check size of commit data chunk [13/20]: commit-graph: detect out-of-bounds extra-edges pointers [14/20]: commit-graph: bounds-check base graphs chunk [15/20]: commit-graph: check size of generations chunk [16/20]: commit-graph: bounds-check generation overflow chunk [17/20]: commit-graph: check bounds when accessing BDAT chunk [18/20]: commit-graph: check bounds when accessing BIDX chunk [19/20]: commit-graph: detect out-of-order BIDX offsets [20/20]: chunk-format: drop pair_chunk_unsafe() bloom.c | 34 +++++++++ chunk-format.c | 24 ++++-- chunk-format.h | 9 ++- commit-graph.c | 119 ++++++++++++++++++++++++----- commit-graph.h | 4 + midx.c | 68 +++++++++++++---- midx.h | 3 + pack-revindex.c | 13 +++- t/lib-chunk.sh | 17 +++++ t/lib-chunk/corrupt-chunk-file.pl | 66 ++++++++++++++++ t/t4216-log-bloom.sh | 50 ++++++++++++ t/t5318-commit-graph.sh | 76 +++++++++++++++++- t/t5319-multi-pack-index.sh | 102 ++++++++++++++++++++++++- t/t5324-split-commit-graph.sh | 20 ++++- t/t5328-commit-graph-64bit-time.sh | 10 +++ 15 files changed, 568 insertions(+), 47 deletions(-) create mode 100644 t/lib-chunk.sh create mode 100644 t/lib-chunk/corrupt-chunk-file.pl -Peff