On Mon, Oct 09, 2023 at 05:05:41PM -0400, Jeff King wrote: > When we are loading a commit-graph chain, we check that each slice of the > chain points to the appropriate set of base graphs via its BASE chunk. > But since we don't record the size of the chunk, we may access > out-of-bounds memory if the file is corrupted. > > Since we know the number of entries we expect to find (based on the > position within the commit-graph-chain file), we can just check the size > up front. > > In theory this would also let us drop the st_mult() call a few lines > later when we actually access the memory, since we know that the > computed offset will fit in a size_t. But because the operands > "g->hash_len" and "n" have types "unsigned char" and "int", we'd have to > cast to size_t first. Leaving the st_mult() does that cast, and makes it > more obvious that we don't have an overflow problem. > > Note that the test does not actually segfault before this patch, since > it just reads garbage from the chunk after BASE (and indeed, it even > rejects the file because that garbage does not have the expected hash > value). You could construct a file with BASE at the end that did > segfault, but corrupting the existing one is easy, and we can check > stderr for the expected message. > > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > commit-graph.c | 8 +++++++- > commit-graph.h | 1 + > t/t5324-split-commit-graph.sh | 14 ++++++++++++++ > 3 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/commit-graph.c b/commit-graph.c > index e4860841fc..4377b547c8 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -435,7 +435,8 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s, > read_chunk(cf, GRAPH_CHUNKID_DATA, graph_read_commit_data, graph); > pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, &graph->chunk_extra_edges, > &graph->chunk_extra_edges_size); > - pair_chunk_unsafe(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs); > + pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs, > + &graph->chunk_base_graphs_size); > > if (s->commit_graph_generation_version >= 2) { > pair_chunk_unsafe(cf, GRAPH_CHUNKID_GENERATION_DATA, > @@ -546,6 +547,11 @@ static int add_graph_to_chain(struct commit_graph *g, > return 0; > } > > + if (g->chunk_base_graphs_size / g->hash_len < n) { > + warning(_("commit-graph base graphs chunk is too small")); > + return 0; > + } > + Nice. Here's a spot where we would not benefit from a function like `pair_chunk_expect()`, since we don't know about the chain when we are parsing an individual layer of it. So storing the length off to the side and checking it within `add_graph_to_chain()` makes sense. Thanks, Taylor