Based on feedback on v1, and the "this is yelling at my users through gc.log" bug I found. Range-diff with v1: 1: e0a09ad641 ! 1: b2dcfa0f55 commit-graph write: add progress output @@ -5,8 +5,8 @@ Before this change the "commit-graph write" command didn't report any progress. On my machine this command takes more than 10 seconds to write the graph for linux.git, and around 1m30s on the - 2015-04-03-1M-git.git[1] test repository, which is a test case for - larger monorepos. + 2015-04-03-1M-git.git[1] test repository (a test case for a large + monorepository). Furthermore, since the gc.writeCommitGraph setting was added in d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27), @@ -19,7 +19,6 @@ $ git -c gc.writeCommitGraph=true gc Enumerating objects: 2821, done. [...] - Total 2821 (delta 1670), reused 2821 (delta 1670) Computing commit graph generation numbers: 100% (867/867), done. On larger repositories, such as linux.git the delayed progress bar(s) @@ -27,6 +26,7 @@ previously happening, printing nothing while we write the graph: $ git -c gc.writeCommitGraph=true gc + [...] Annotating commits in commit graph: 1565573, done. Computing commit graph generation numbers: 100% (782484/782484), done. @@ -42,20 +42,90 @@ With --stdin-packs we don't show any estimation of how much is left to do. This is because we might be processing more than one pack. We - could be less lazy here and show progress, either detect by detecting - that we're only processing one pack, or by first looping over the - packs to discover how many commits they have. I don't see the point in - doing that work. So instead we get (on 2015-04-03-1M-git.git): + could be less lazy here and show progress, either by detecting that + we're only processing one pack, or by first looping over the packs to + discover how many commits they have. I don't see the point in doing + that work. So instead we get (on 2015-04-03-1M-git.git): $ echo pack-<HASH>.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs Finding commits for commit graph: 13064614, done. Annotating commits in commit graph: 3001341, done. Computing commit graph generation numbers: 100% (1000447/1000447), done. + No GC mode uses --stdin-packs. It's what they use at Microsoft to + manually compute the generation numbers for their collection of large + packs which are never coalesced. + + The reason we need a "report_progress" variable passed down from "git + gc" is so that we don't report this output when we're running in the + process "git gc --auto" detaches from the terminal. + + Since we write the commit graph from the "git gc" process itself (as + opposed to what we do with say the "git repack" phase), we'd end up + writing the output to .git/gc.log and reporting it to the user next + time as part of the "The last gc run reported the following[...]" + error, see 329e6e8794 ("gc: save log from daemonized gc --auto and + print it next time", 2015-09-19). + + So we must keep track of whether or not we're running in that + demonized mode, and if so print no progress. + + See [2] and subsequent replies for a discussion of an approach not + taken in compute_generation_numbers(). I.e. we're saying "Computing + commit graph generation numbers", even though on an established + history we're mostly skipping over all the work we did in the + past. This is similar to the white lie we tell in the "Writing + objects" phase (not all are objects being written). + + Always showing progress is considered more important than + accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang + for 6 seconds with no output on the second "git gc" if no changes were + made to any objects in the interim if we'd take the approach in [2]. + 1. https://github.com/avar/2015-04-03-1M-git + 2. <c6960252-c095-fb2b-e0bc-b1e6bb261614@xxxxxxxxx> + (https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261614@xxxxxxxxx/) + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> + diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c + --- a/builtin/commit-graph.c + +++ b/builtin/commit-graph.c +@@ + opts.obj_dir = get_object_directory(); + + if (opts.reachable) { +- write_commit_graph_reachable(opts.obj_dir, opts.append); ++ write_commit_graph_reachable(opts.obj_dir, opts.append, 1); + return 0; + } + +@@ + write_commit_graph(opts.obj_dir, + pack_indexes, + commit_hex, +- opts.append); ++ opts.append, ++ 1); + + string_list_clear(&lines, 0); + return 0; + + diff --git a/builtin/gc.c b/builtin/gc.c + --- a/builtin/gc.c + +++ b/builtin/gc.c +@@ + clean_pack_garbage(); + + if (gc_write_commit_graph) +- write_commit_graph_reachable(get_object_directory(), 0); ++ write_commit_graph_reachable(get_object_directory(), 0, ++ !daemonized); + + if (auto_gc && too_many_loose_objects()) + warning(_("There are too many unreachable loose objects; " + diff --git a/commit-graph.c b/commit-graph.c --- a/commit-graph.c +++ b/commit-graph.c @@ -87,14 +157,20 @@ if (packed_object_info(the_repository, pack, offset, &oi) < 0) die(_("unable to get type of object %s"), oid_to_hex(oid)); @@ + } + } + +-static void close_reachable(struct packed_oid_list *oids) ++static void close_reachable(struct packed_oid_list *oids, int report_progress) { int i; struct commit *commit; + struct progress *progress = NULL; + int j = 0; -+ progress = start_delayed_progress( -+ _("Annotating commits in commit graph"), 0); ++ if (report_progress) ++ progress = start_delayed_progress( ++ _("Annotating commits in commit graph"), 0); for (i = 0; i < oids->nr; i++) { + display_progress(progress, ++j); commit = lookup_commit(the_repository, &oids->list[i]); @@ -121,16 +197,20 @@ + stop_progress(&progress); } - static void compute_generation_numbers(struct packed_commit_list* commits) +-static void compute_generation_numbers(struct packed_commit_list* commits) ++static void compute_generation_numbers(struct packed_commit_list* commits, ++ int report_progress) { int i; struct commit_list *list = NULL; + struct progress *progress = NULL; -+ progress = start_progress( -+ _("Computing commit graph generation numbers"), commits->nr); ++ if (report_progress) ++ progress = start_progress( ++ _("Computing commit graph generation numbers"), ++ commits->nr); for (i = 0; i < commits->nr; i++) { -+ display_progress(progress, i); ++ display_progress(progress, i + 1); if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && commits->list[i]->generation != GENERATION_NUMBER_ZERO) continue; @@ -138,11 +218,34 @@ } } } -+ display_progress(progress, i); + stop_progress(&progress); } static int add_ref_to_list(const char *refname, +@@ + return 0; + } + +-void write_commit_graph_reachable(const char *obj_dir, int append) ++void write_commit_graph_reachable(const char *obj_dir, int append, ++ int report_progress) + { + struct string_list list; + + string_list_init(&list, 1); + for_each_ref(add_ref_to_list, &list); +- write_commit_graph(obj_dir, NULL, &list, append); ++ write_commit_graph(obj_dir, NULL, &list, append, report_progress); + } + + void write_commit_graph(const char *obj_dir, + struct string_list *pack_indexes, + struct string_list *commit_hex, +- int append) ++ int append, int report_progress) + { + struct packed_oid_list oids; + struct packed_commit_list commits; @@ int num_chunks; int num_extra_edges; @@ -160,9 +263,11 @@ int dirlen; strbuf_addf(&packname, "%s/pack/", obj_dir); dirlen = packname.len; -+ oids.progress = start_delayed_progress( -+ _("Finding commits for commit graph"), 0); -+ oids.progress_done = 0; ++ if (report_progress) { ++ oids.progress = start_delayed_progress( ++ _("Finding commits for commit graph"), 0); ++ oids.progress_done = 0; ++ } for (i = 0; i < pack_indexes->nr; i++) { struct packed_git *p; strbuf_setlen(&packname, dirlen); @@ -175,14 +280,16 @@ } if (commit_hex) { -+ progress = start_delayed_progress( -+ _("Finding commits for commit graph"), commit_hex->nr); ++ if (report_progress) ++ progress = start_delayed_progress( ++ _("Finding commits for commit graph"), ++ commit_hex->nr); for (i = 0; i < commit_hex->nr; i++) { const char *end; struct object_id oid; struct commit *result; -+ display_progress(progress, i); ++ display_progress(progress, i + 1); if (commit_hex->items[i].string && parse_oid_hex(commit_hex->items[i].string, &oid, &end)) continue; @@ -190,17 +297,48 @@ oids.nr++; } } -+ display_progress(progress, i); + stop_progress(&progress); } - if (!pack_indexes && !commit_hex) + if (!pack_indexes && !commit_hex) { -+ oids.progress = start_delayed_progress( -+ _("Finding commits for commit graph"), 0); ++ if (report_progress) ++ oids.progress = start_delayed_progress( ++ _("Finding commits for commit graph"), 0); for_each_packed_object(add_packed_commits, &oids, 0); + stop_progress(&oids.progress); + } - close_reachable(&oids); +- close_reachable(&oids); ++ close_reachable(&oids, report_progress); + + QSORT(oids.list, oids.nr, commit_compare); + +@@ + if (commits.nr >= GRAPH_PARENT_MISSING) + die(_("too many commits to write graph")); + +- compute_generation_numbers(&commits); ++ compute_generation_numbers(&commits, report_progress); + + graph_name = get_commit_graph_filename(obj_dir); + if (safe_create_leading_directories(graph_name)) + + diff --git a/commit-graph.h b/commit-graph.h + --- a/commit-graph.h + +++ b/commit-graph.h +@@ + + struct commit_graph *load_commit_graph_one(const char *graph_file); + +-void write_commit_graph_reachable(const char *obj_dir, int append); ++void write_commit_graph_reachable(const char *obj_dir, int append, ++ int report_progress); + void write_commit_graph(const char *obj_dir, + struct string_list *pack_indexes, + struct string_list *commit_hex, +- int append); ++ int append, int report_progress); + + int verify_commit_graph(struct repository *r, struct commit_graph *g); 2: a364297d15 ! 2: 775237cffb commit-graph verify: add progress output @@ -23,6 +23,10 @@ real 0m7.813s [...] + Since the "commit-graph verify" subcommand is never called from "git + gc", we don't have to worry about passing some some "report_progress" + progress variable around for this codepath. + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> diff --git a/commit-graph.c b/commit-graph.c @@ -47,7 +51,7 @@ struct commit_list *graph_parents, *odb_parents; uint32_t max_generation = 0; -+ display_progress(progress, i); ++ display_progress(progress, i + 1); hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); graph_commit = lookup_commit(r, &cur_oid); @@ -55,7 +59,6 @@ graph_commit->date, odb_commit->date); } -+ display_progress(progress, i); + stop_progress(&progress); return verify_commit_graph_error; Ævar Arnfjörð Bjarmason (2): commit-graph write: add progress output commit-graph verify: add progress output builtin/commit-graph.c | 5 ++-- builtin/gc.c | 3 +- commit-graph.c | 65 ++++++++++++++++++++++++++++++++++++------ commit-graph.h | 5 ++-- 4 files changed, 65 insertions(+), 13 deletions(-) -- 2.19.0.rc1.350.ge57e33dbd1