On Fri, Mar 22 2019, Ævar Arnfjörð Bjarmason wrote: > On Fri, Mar 22 2019, SZEDER Gábor wrote: > >> On Fri, Mar 22, 2019 at 12:11:26PM +0100, Ævar Arnfjörð Bjarmason wrote: >>> >>> On Fri, Mar 22 2019, SZEDER Gábor wrote: >>> >>> > Commit 49bbc57a57 (commit-graph write: emit a percentage for all >>> > progress, 2019-01-19) was a bit overeager when it added progress >>> > percentages to the "Expanding reachable commits in commit graph" phase >>> > as well, because most of the time the number of commits that phase has >>> > to iterate over is not known in advance and grows significantly, and, >>> > consequently, we end up with nonsensical numbers: >>> > >>> > $ git commit-graph write --reachable >>> > Expanding reachable commits in commit graph: 138606% (824706/595), done. >>> > [...] >>> > >>> > $ git rev-parse v5.0 | git commit-graph write --stdin-commits >>> > Expanding reachable commits in commit graph: 81264400% (812644/1), done. >>> > [...] >>> > >>> > Therefore, don't show progress percentages in the "Expanding reachable >>> > commits in commit graph" phase. >>> >>> There's indeed a bug here as your examples show, but there *are* cases >>> where it's correct, as the commit message for my patch on "master" shows >>> there's cases where we correctly. >>> >>> So this "fixes" things by always removing the progress, why not instead >>> pass down the state to close_reachable() about what we're walking over, >>> so we can always show progress, or at least in some cases? >> >> The cases where it does display correct percentages are exceptional, >> and doesn't worth the effort to try to find out whether ther current >> operation happens to be such a case. > > It's the "write" entry point without arguments that displays the correct > progress. So not exceptional, but yeah, it's not what we use on "gc". > > In any case, the problem is that sometimes we've walked the full set of > commits already, and some other times we haven't. > > So in cases where we have we can show progress, and as a TODO (I think > this came up in previous discussions), we could do better if we had a > approximate_commit_count(). > > In any case, the below fix seems correct to me, but I haven't poked it > much. It *does* suffer from a theoretical race with the progress bar > similar to d9b1b309cf ("commit-graph write: show progress for object > search", 2019-01-19), but I work around it in the same way: So double brainfart there, I meant "race with new objects being added/removed", but .... > diff --git a/commit-graph.c b/commit-graph.c > index 47e9be0a3a..0fab3d8b2b 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -693,7 +693,8 @@ static void add_missing_parents(struct packed_oid_list *oids, struct commit *com > } > } > > -static void close_reachable(struct packed_oid_list *oids, int report_progress) > +static void close_reachable(struct packed_oid_list *oids, int report_progress, > + uint64_t oids_count_for_progress) > { > int i; > struct commit *commit; > @@ -717,7 +718,8 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) > */ > if (report_progress) > progress = start_delayed_progress( > - _("Expanding reachable commits in commit graph"), oids->nr); > + _("Expanding reachable commits in commit graph"), > + oids_count_for_progress); > for (i = 0; i < oids->nr; i++) { > display_progress(progress, i + 1); > commit = lookup_commit(the_repository, &oids->list[i]); > @@ -725,6 +727,8 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) > if (commit && !parse_commit(commit)) > add_missing_parents(oids, commit); > } > + if (oids->nr < oids_count_for_progress) > + display_progress(progress, oids_count_for_progress); ...that race exists for the approximate_object_count case in d9b1b309cf because in the time between approximate_object_count() and our second walk via for_each_packed_object() the packs may have changed (e.g. concurrent "git push"). But this won't happen in this case, because we already know the commits we want to seed from, and the only thing that will "change" is that we haven't walked the parents of some of those commits when we e.g. seed from just HEAD. But when oids_count_for_progress is >0 here we've walked all the commits we found in the relevant packs already, it's not possible that they've gained new parents in the interim, so this "oids->nr < oids_count_for_progress" case here is pointless cargo-culting. > stop_progress(&progress); > > if (report_progress) > @@ -829,6 +833,7 @@ void write_commit_graph(const char *obj_dir, > uint64_t progress_cnt = 0; > struct strbuf progress_title = STRBUF_INIT; > unsigned long approx_nr_objects; > + uint64_t oids_count_for_progress = 0; > > if (!commit_graph_compatible(the_repository)) > return; > @@ -934,9 +939,10 @@ void write_commit_graph(const char *obj_dir, > if (oids.progress_done < approx_nr_objects) > display_progress(oids.progress, approx_nr_objects); > stop_progress(&oids.progress); > + oids_count_for_progress = oids.nr; > } > > - close_reachable(&oids, report_progress); > + close_reachable(&oids, report_progress, oids_count_for_progress); > > if (report_progress) > progress = start_delayed_progress(