Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:

> If the commit-graph file becomes corrupt, we need a way to verify
> that its contents match the object database. In the manner of
> 'git fsck' we will implement a 'git commit-graph verify' subcommand
> to report all issues with the file.
>
> Add the 'verify' subcommand to the 'commit-graph' builtin and its
> documentation. The subcommand is currently a no-op except for
> loading the commit-graph into memory, which may trigger run-time
> errors that would be caught by normal use.

So this commit simply gets the boilerplate out of the way for
implementing the 'git commit-graph verify' subcommand. Good.

> Add a simple test that
> ensures the command returns a zero error code.

Nice.

>
> If no commit-graph file exists, this is an acceptable state. Do
> not report any errors.

All right. I assume that, since this is an explicit verification call,
it ignores the core.commitGraph setting, doesn't it?

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  Documentation/git-commit-graph.txt |  6 ++++++
>  builtin/commit-graph.c             | 38 ++++++++++++++++++++++++++++++++++++++
>  commit-graph.c                     | 26 ++++++++++++++++++++++++++
>  commit-graph.h                     |  2 ++
>  t/t5318-commit-graph.sh            | 10 ++++++++++
>  5 files changed, 82 insertions(+)
>
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index 4c97b555cc..a222cfab08 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -10,6 +10,7 @@ SYNOPSIS
>  --------
>  [verse]
>  'git commit-graph read' [--object-dir <dir>]
> +'git commit-graph verify' [--object-dir <dir>]
>  'git commit-graph write' <options> [--object-dir <dir>]

In alphabetical order, good.

>
>
> @@ -52,6 +53,11 @@ existing commit-graph file.
>  Read a graph file given by the commit-graph file and output basic
>  details about the graph file. Used for debugging purposes.
>
> +'verify'::
> +
> +Read the commit-graph file and verify its contents against the object
> +database. Used to check for corrupted data.
> +

All right, good enough description.

>
>  EXAMPLES
>  --------
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index f0875b8bf3..0433dd6e20 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -8,10 +8,16 @@
>  static char const * const builtin_commit_graph_usage[] = {
>  	N_("git commit-graph [--object-dir <objdir>]"),
>  	N_("git commit-graph read [--object-dir <objdir>]"),
> +	N_("git commit-graph verify [--object-dir <objdir>]"),
>  	N_("git commit-graph write [--object-dir <objdir>] [--append] [--stdin-packs|--stdin-commits]"),
>  	NULL
>  };

In alphabetical order, same as in the manpage for git-commit-graph.

>
> +static const char * const builtin_commit_graph_verify_usage[] = {
> +	N_("git commit-graph verify [--object-dir <objdir>]"),
> +	NULL
> +};
> +
>  static const char * const builtin_commit_graph_read_usage[] = {
>  	N_("git commit-graph read [--object-dir <objdir>]"),
>  	NULL
> @@ -29,6 +35,36 @@ static struct opts_commit_graph {
>  	int append;
>  } opts;
>
> +
> +static int graph_verify(int argc, const char **argv)
> +{
> +	struct commit_graph *graph = 0;
> +	char *graph_name;
> +
> +	static struct option builtin_commit_graph_verify_options[] = {
> +		OPT_STRING(0, "object-dir", &opts.obj_dir,
> +			   N_("dir"),
> +			   N_("The object directory to store the graph")),
> +		OPT_END(),
> +	};
> +
> +	argc = parse_options(argc, argv, NULL,
> +			     builtin_commit_graph_verify_options,
> +			     builtin_commit_graph_verify_usage, 0);
> +
> +	if (!opts.obj_dir)
> +		opts.obj_dir = get_object_directory();

Getting the boilerplate of implementing the command mostly out of the
way. Good.

> +
> +	graph_name = get_commit_graph_filename(opts.obj_dir);
> +	graph = load_commit_graph_one(graph_name);

So we are verifying only the commit-graph file belonging directly to
the current repository, as I expected.
This is needed for warnings and error messages from the 'verify'
action, so that they can tell which file has problems.

This means that running 'git commit-graph verify' could miss problems
that are in the commit-graph file of one of the alternates. It is very
easy, though, to check all commit-graph files that would be read, and
their data concatenated, when using the commit-graph feature (e.g. by
'git commit-graph read', IIRC):

  $ git commit-graph verify
  $ for obj_dir in $(cat .git/objects/info/alternates)
    do
        git commit-graph verify --object-dir="$obj_dir"
    done

Note: I have not checked that the above works.

> +	FREE_AND_NULL(graph_name);

Freeing the resources, always nice to have.

> +
> +	if (!graph)
> +		return 0;

DS> If no commit-graph file exists, this is an acceptable state. Do
DS> not report any errors.

Right, a nonexistent commit-graph file is certainly valid ;-)

> +
> +	return verify_commit_graph(graph);
> +}

I guess that graph_verify() would not change much, if at all, in
subsequent commits in this patch series.

> +
>  static int graph_read(int argc, const char **argv)
>  {
>  	struct commit_graph *graph = NULL;
> @@ -163,6 +199,8 @@ int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>  			     PARSE_OPT_STOP_AT_NON_OPTION);
>
>  	if (argc > 0) {
> +		if (!strcmp(argv[0], "verify"))
> +			return graph_verify(argc, argv);
>  		if (!strcmp(argv[0], "read"))
>  			return graph_read(argc, argv);
>  		if (!strcmp(argv[0], "write"))

Not in alphabetical order... is there a reason for that?

> diff --git a/commit-graph.c b/commit-graph.c
> index 25893ec096..55b41664ee 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -836,3 +836,29 @@ void write_commit_graph(const char *obj_dir,
>  	oids.alloc = 0;
>  	oids.nr = 0;
>  }
> +
> +static int verify_commit_graph_error;
> +
> +static void graph_report(const char *fmt, ...)
> +{
> +	va_list ap;
> +	struct strbuf sb = STRBUF_INIT;
> +	verify_commit_graph_error = 1;
> +
> +	va_start(ap, fmt);
> +	strbuf_vaddf(&sb, fmt, ap);
> +
> +	fprintf(stderr, "%s\n", sb.buf);
> +	strbuf_release(&sb);
> +	va_end(ap);

Why do you use strbuf_vaddf + fprintf instead of a straightforward
vfprintf (or a function instead of a variadic macro)? Is it because of
[string] safety?

> +}
> +
> +int verify_commit_graph(struct commit_graph *g)
> +{
> +	if (!g) {
> +		graph_report("no commit-graph file loaded");
> +		return 1;
> +	}

All right, this is just a placeholder: we should never get this
message, because graph_verify() returns 0 (EXIT_SUCCESS) when no
commit-graph file is loaded, before verify_commit_graph() is invoked.

> +
> +	return verify_commit_graph_error;

All right, this is for the future. Good.

> +}
> diff --git a/commit-graph.h b/commit-graph.h
> index 96cccb10f3..71a39c5a57 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -53,4 +53,6 @@ void write_commit_graph(const char *obj_dir,
>  			int nr_commits,
>  			int append);
>
> +int verify_commit_graph(struct commit_graph *g);
> +

Why does this need to be exported? I think it is not used outside of
commit-graph.c, is it?

>  #endif
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 77d85aefe7..6ca451dfd2 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -11,6 +11,11 @@ test_expect_success 'setup full repo' '
>  	objdir=".git/objects"
>  '
>
> +test_expect_success 'verify graph with no graph file' '
> +	cd "$TRASH_DIRECTORY/full" &&

Is such a bare `cd`, without a corresponding `cd` back or a subshell,
safe?

> +	git commit-graph verify
> +'
> +
>  test_expect_success 'write graph with no packs' '
>  	cd "$TRASH_DIRECTORY/full" &&
>  	git commit-graph write --object-dir .
&&
> @@ -230,4 +235,9 @@ test_expect_success 'perform fast-forward merge in full repo' '
>  	test_cmp expect output
>  '
>
> +test_expect_success 'git commit-graph verify' '
> +	cd "$TRASH_DIRECTORY/full" &&
> +	git commit-graph verify >output
> +'

Those are tests with nearly the same code, but they are (by their
descriptions) testing different things. This means that they rely on
side effects of earlier tests.

This is suboptimal, as it means that it would be impossible or very
difficult to run individual tests (e.g. with the GIT_SKIP_TESTS
environment variable, or with an individual test script's --run
option), unless you know which tests set up the repository state for
later tests.

It also means that running only failed tests with
prove --state=failed,save, or equivalently with

  $ make DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS='--state=failed,save' test

wouldn't work correctly.

As Johannes Schindelin (alias Dscho) said in the latest Git Rev News
interview: https://git.github.io/rev_news/2018/05/16/edition-39/

JS> We have a test suite where debugging a regression may mean that you
JS> have to run 98 test cases before the failing one every single time in
JS> the edit/compile/debug cycle, because the 99th test case may depend on
JS> a side effect of at least one of the preceding test cases. Git’s test
JS> suite is so not [21st century best practices][1].
JS>
JS> [1]: https://www.slideshare.net/BuckHodges/lessons-learned-doing-devops-at-scale-at-microsoft

I think this can be solved quite efficiently by creating and using a
shell function, or two shell functions, which would either:

 * rename the commit-graph file to some other temporary name if it
   exists, and move it back after the test.
 * create the commit-graph file if it does not exist.

For example (untested):

  prepare_no_commit_graph() {
	mv .git/objects/info/commit-graph .git/objects/info/commit-graph.away &&
	test_when_finished "mv .git/objects/info/commit-graph.away .git/objects/info/commit-graph"
  }

  prepare_commit_graph() {
	if !
	   test -f ".git/objects/info/commit-graph"
	then
		git commit-graph write
	fi
  }

Or something like that.

> +
>  test_done

Regards,
-- 
Jakub Narębski
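
P.S. To make the strbuf_vaddf() question above concrete, here is a minimal standalone sketch of the vfprintf variant I had in mind. It is plain C that does not use Git's strbuf API and is not taken from the patch; only the names graph_report and verify_commit_graph_error come from it. Treat it as an illustration of the question, not as a proposed replacement: formatting into a strbuf first may well have advantages that this sketch does not capture.

```c
#include <stdarg.h>
#include <stdio.h>

/* Set to 1 as soon as any problem is reported, as in the patch. */
static int verify_commit_graph_error;

/*
 * Hypothetical variant of graph_report(): format straight to stderr
 * with vfprintf() instead of going through an intermediate buffer.
 */
static void graph_report(const char *fmt, ...)
{
	va_list ap;

	verify_commit_graph_error = 1;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);

	/* The patch appends a newline after each message; do the same. */
	fputc('\n', stderr);
}
```

A caller would use it exactly like the version in the patch, e.g. graph_report("no commit-graph file loaded"), and then check verify_commit_graph_error.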