Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:

> If core.commitGraph is true, verify the contents of the commit-graph
> during 'git fsck' using the 'git commit-graph verify' subcommand. Run
> this check on all alternates, as well.

All right, so we have one config variable to control the use of the
serialized commit-graph feature.  Nice.

>
> We use a new process for two reasons:
>
> 1. The subcommand decouples the details of loading and verifying a
>    commit-graph file from the other fsck details.

All right, I can agree with that.

On the other hand, using a subcommand makes debugging harder, though not
in this case (the functionality is well separated and can easily be
called as a standalone command to be debugged).

>
> 2. The commit-graph verification requires the commits to be loaded
>    in a specific order to guarantee we parse from the commit-graph
>    file for some objects and from the object database for others.

I don't quite understand this.  Could you explain it in more detail?

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  Documentation/git-fsck.txt |  3 +++
>  builtin/fsck.c             | 21 +++++++++++++++++++++
>  t/t5318-commit-graph.sh    |  8 ++++++++
>  3 files changed, 32 insertions(+)
>
> diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
> index b9f060e3b2..ab9a93fb9b 100644
> --- a/Documentation/git-fsck.txt
> +++ b/Documentation/git-fsck.txt
> @@ -110,6 +110,9 @@ Any corrupt objects you will have to find in backups or other archives
>  (i.e., you can just remove them and do an 'rsync' with some other site in
>  the hopes that somebody else has the object you have corrupted).
>
> +If core.commitGraph is true, the commit-graph file will also be inspected

Shouldn't we use `core.commitGraph` here?

> +using 'git commit-graph verify'. See linkgit:git-commit-graph[1].
> +
>  Extracted Diagnostics
>  ---------------------
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index ef78c6c00c..a6d5045b77 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -16,6 +16,7 @@
>  #include "streaming.h"
>  #include "decorate.h"
>  #include "packfile.h"
> +#include "run-command.h"
>
>  #define REACHABLE 0x0001
>  #define SEEN 0x0002
> @@ -45,6 +46,7 @@ static int name_objects;
>  #define ERROR_REACHABLE 02
>  #define ERROR_PACK 04
>  #define ERROR_REFS 010
> +#define ERROR_COMMIT_GRAPH 020

Minor nitpick and a sidenote: I wonder if it wouldn't be better to either
use hexadecimal constants, or use (1 << n) for all ERROR_* preprocessor
constants.

>
>  static const char *describe_object(struct object *obj)
>  {
> @@ -815,5 +817,24 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  	}
>
>  	check_connectivity();
> +
> +	if (core_commit_graph) {
> +		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
> +		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL, NULL };

I see that the NULLs at indices 2 and 3 (3rd and 4th place) are
placeholders for "--object-dir" and <alternates-object-dir-path>, and
that the last one is the terminator for that case, but what is the
next-to-last NULL (5th place) for?

> +		commit_graph_verify.argv = verify_argv;
> +		commit_graph_verify.git_cmd = 1;
> +
> +		if (run_command(&commit_graph_verify))
> +			errors_found |= ERROR_COMMIT_GRAPH;
> +
> +		prepare_alt_odb();
> +		for (alt = alt_odb_list; alt; alt = alt->next) {
> +			verify_argv[2] = "--object-dir";
> +			verify_argv[3] = alt->path;
> +			if (run_command(&commit_graph_verify))
> +				errors_found |= ERROR_COMMIT_GRAPH;
> +		}
> +	}

For performance reasons it may be better to start those 'git commit-graph
verify' commands asynchronously earlier, so that they can run
concurrently with the other checks, and then wait for them and collect
their exit codes at the end of the git-fsck run.  But that is probably
better left for a separate commit.

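To illustrate the start-early / wait-late idea, here is a minimal
standalone POSIX sketch.  It deliberately does not use Git's run-command
API (I would expect start_command() and finish_command() to be the
natural fit there, but I have not tried it).  spawn_check(), wait_check()
and run_checks() are hypothetical names made up for this example; only
ERROR_COMMIT_GRAPH is taken from the patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define ERROR_COMMIT_GRAPH 020	/* value from the patch under review */

/* Hypothetical helper: start argv[0] in the background; return pid or -1. */
static pid_t spawn_check(char *const argv[])
{
	pid_t pid = fork();

	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached when exec itself failed */
	}
	return pid;
}

/* Hypothetical helper: reap the child; 0 only if it exited with status 0. */
static int wait_check(pid_t pid)
{
	int status;

	if (pid < 0)
		return -1;
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Assumed overall flow: spawn first, do the other checks, collect last. */
int run_checks(char *const verify_argv[])
{
	int errors_found = 0;
	pid_t verify = spawn_check(verify_argv);

	/* ... the remaining fsck checks would run here, concurrently ... */

	if (wait_check(verify))
		errors_found |= ERROR_COMMIT_GRAPH;
	return errors_found;
}
```

With alternates there would be one child per object directory, so the
pids would have to be kept in a list and reaped in a loop at the end.
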
> +
>  	return errors_found;
>  }
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 2680a2ebff..4941937163 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -394,4 +394,12 @@ test_expect_success 'detect invalid checksum hash' '
>  		"incorrect checksum"
>  '
>
> +test_expect_success 'git fsck (checks commit-graph)' '
> +	cd "$TRASH_DIRECTORY/full" &&
> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	test_must_fail git fsck
> +'

All right; though the same caveats apply as with the previous commit in
the series.  Perhaps it would be better to truncate the commit-graph
file, or to corrupt it in some 'random' place.

> +
>  test_done

Best,
-- 
Jakub Narębski