Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes: > The 'verify' subcommand must compare the commit content parsed from the > commit-graph and compare it against the content in the object database. You have "compare" twice in the above sentence. > Use lookup_commit() and parse_commit_in_graph_one() to parse the commits > from the graph and compare against a commit that is loaded separately > and parsed directly from the object database. All right, that looks like a nice extension of what was done in previous patch. We want to check that cache (serialized commit graph) matches reality (object database). > > Add checks for the root tree OID. All right; isn't it that now we check almost all information from commit-graph that hs match in object database (with exception of commit parents, I think). > > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > --- > commit-graph.c | 17 ++++++++++++++++- > t/t5318-commit-graph.sh | 7 +++++++ > 2 files changed, 23 insertions(+), 1 deletion(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 0420ebcd87..19ea369fc6 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -880,6 +880,8 @@ int verify_commit_graph(struct commit_graph *g) > return verify_commit_graph_error; NOTE: we will be checking Commit Data chunk; I think it would be good idea to verify that size of Commit Data chunk matches (N * (H + 16) bytes) that format gives us, so that we don't accidentally red outside of memory if something got screwed up (like number of commits being wrong, or file truncated). > > for (i = 0; i < g->num_commits; i++) { > + struct commit *graph_commit; > + > hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); > > if (i && oidcmp(&prev_oid, &cur_oid) >= 0) > @@ -897,6 +899,11 @@ int verify_commit_graph(struct commit_graph *g) > > cur_fanout_pos++; > } > + > + graph_commit = lookup_commit(&cur_oid); So now I see why we add all commits to memory (to hash structure). > + if (!parse_commit_in_graph_one(g, graph_commit)) > + graph_report("failed to parse %s from commit-graph", > + oid_to_hex(&cur_oid)); All right, this verifies that commit in OID Lookup chunk has parse-able data in Commit Data chunk, isn't it? > } > > while (cur_fanout_pos < 256) { > @@ -913,16 +920,24 @@ int verify_commit_graph(struct commit_graph *g) > return verify_commit_graph_error; > > for (i = 0; i < g->num_commits; i++) { > - struct commit *odb_commit; > + struct commit *graph_commit, *odb_commit; > > hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); > > + graph_commit = lookup_commit(&cur_oid); > odb_commit = (struct commit *)create_object(cur_oid.hash, alloc_commit_node()); All right, so we have commit data from graph, and commit data from the object database. > if (parse_commit_internal(odb_commit, 0, 0)) { > graph_report("failed to parse %s from object database", > oid_to_hex(&cur_oid)); > continue; > } > + > + if (oidcmp(&get_commit_tree_in_graph_one(g, graph_commit)->object.oid, > + get_commit_tree_oid(odb_commit))) > + graph_report("root tree OID for commit %s in commit-graph is %s != %s", > + oid_to_hex(&cur_oid), > + oid_to_hex(get_commit_tree_oid(graph_commit)), > + oid_to_hex(get_commit_tree_oid(odb_commit))); Maybe explicitly say that it doesn't match the value from the object database; OTOH this may be too verbose. > } > > return verify_commit_graph_error; > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh > index 996a016239..21cc8e82f3 100755 > --- a/t/t5318-commit-graph.sh > +++ b/t/t5318-commit-graph.sh > @@ -267,6 +267,8 @@ GRAPH_BYTE_FANOUT2=`expr $GRAPH_FANOUT_OFFSET + 4 \* 255` > GRAPH_OID_LOOKUP_OFFSET=`expr $GRAPH_FANOUT_OFFSET + 4 \* 256` > GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8` > GRAPH_BYTE_OID_LOOKUP_MISSING=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 4 + 10` > +GRAPH_COMMIT_DATA_OFFSET=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* $NUM_COMMITS` > +GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET All right, so the first is entry into record in Commit Data chunk, and the latter points into tree entry in this record -- which entry is first field: Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes) * The first H bytes are for the OID of the root tree. > > # usage: corrupt_graph_and_verify <position> <data> <string> > # Manipulates the commit-graph file at the position > @@ -341,4 +343,9 @@ test_expect_success 'detect OID not in object database' ' > "from object database" > ' > > +test_expect_success 'detect incorrect tree OID' ' > + corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_TREE "\01" \ > + "root tree OID for commit" > +' All right. I wonder if we can create a test for first check added (that Commit Graph data parses correctly), that is the one with the following error message: "failed to parse <OID> from commit-graph file". > + > test_done Regards, -- Jakub Narębski