On Fri, Mar 22 2019, SZEDER Gábor wrote: > On Fri, Mar 22, 2019 at 04:11:24PM +0100, Ævar Arnfjörð Bjarmason wrote: >> >> On Fri, Mar 22 2019, SZEDER Gábor wrote: >> >> > On Fri, Mar 22, 2019 at 03:28:34PM +0100, Ævar Arnfjörð Bjarmason wrote: >> >> >> >> On Fri, Mar 22 2019, SZEDER Gábor wrote: >> >> >> >> > On Fri, Mar 22, 2019 at 12:11:26PM +0100, Ævar Arnfjörð Bjarmason wrote: >> >> >> >> >> >> On Fri, Mar 22 2019, SZEDER Gábor wrote: >> >> >> >> >> >> > Commit 49bbc57a57 (commit-graph write: emit a percentage for all >> >> >> > progress, 2019-01-19) was a bit overeager when it added progress >> >> >> > percentages to the "Expanding reachable commits in commit graph" phase >> >> >> > as well, because most of the time the number of commits that phase has >> >> >> > to iterate over is not known in advance and grows significantly, and, >> >> >> > consequently, we end up with nonsensical numbers: >> >> >> > >> >> >> > $ git commit-graph write --reachable >> >> >> > Expanding reachable commits in commit graph: 138606% (824706/595), done. >> >> >> > [...] >> >> >> > >> >> >> > $ git rev-parse v5.0 | git commit-graph write --stdin-commits >> >> >> > Expanding reachable commits in commit graph: 81264400% (812644/1), done. >> >> >> > [...] >> >> >> > >> >> >> > Therefore, don't show progress percentages in the "Expanding reachable >> >> >> > commits in commit graph" phase. >> >> >> >> >> >> There's indeed a bug here as your examples show, but there *are* cases >> >> >> where it's correct, as the commit message for my patch on "master" shows >> >> >> there's cases where we correctly. >> >> >> >> >> >> So this "fixes" things by always removing the progress, why not instead >> >> >> pass down the state to close_reachable() about what we're walking over, >> >> >> so we can always show progress, or at least in some cases? >> >> > >> >> > The cases where it does display correct percentages are exceptional, >> >> > and doesn't worth the effort to try to find out whether ther current >> >> > operation happens to be such a case. >> >> >> >> It's the "write" entry point without arguments that displays the correct >> >> progress. So not exceptional, but yeah, it's not what we use on "gc". >> > >> > Bit it displays the correct number only if all the reachable commits >> > are in packfiles, which is not necessarily the case (e.g. unpacked >> > small packs during 'git fetch'). >> >> No, argument-less "write" only considers packed commits. > > No, it considers packed commits as starting points, and then expands > to all reachable commits, that's what that loop in question is about. > > $ git init > Initialized empty Git repository in /tmp/test/.git/ > $ echo >file > $ git add file > $ git commit -q -m initial > $ echo 1 >file > $ git commit -q -m 1 file > $ git rev-parse HEAD | git pack-objects > .git/objects/pack/pack > Enumerating objects: 1, done. > Counting objects: 100% (1/1), done. > ece3ff72952af2b28e048fa5c58db88c44312876 > Writing objects: 100% (1/1), done. > Total 1 (delta 0), reused 0 (delta 0) > $ git commit-graph write > Computing commit graph generation numbers: 100% (2/2), done. Ah, indeed. I think in practice it'll be unlikely to happen in practice except on servers due to receive.unpackLimit, and then it won't be off by much. So I think it's best to do something like my WIP patch upthread + the "snap to 100%" at the end behavior. I'll try to clean that up / test it / submit that sometime soon. >> >> In any case, the problem is that sometimes we've walked the full set of >> >> commits already, and some other times we haven't. >> > >> > ... and that we can't really be sure whether we've walked the full set >> > of commits until after this loop. >> >> I'm fairly sure we can when we start with a full walk. See my >> explanation in <87imwbc6x8.fsf@xxxxxxxxxxxxxxxxxxx>, but I may have >> missed something. >> >> >> So in cases where we have we can show progress, and as a TODO (I think >> >> this came up in previous discussions), we could do better if we had a >> >> approximate_commit_count(). >> >> >> >> In any case, the below fix seems correct to me, but I haven't poked it >> >> much. It *does* suffer from a theoretical race with the progress bar >> >> similar to d9b1b309cf ("commit-graph write: show progress for object >> >> search", 2019-01-19), but I work around it in the same way: >> >> >> >> diff --git a/commit-graph.c b/commit-graph.c >> >> index 47e9be0a3a..0fab3d8b2b 100644 >> >> --- a/commit-graph.c >> >> +++ b/commit-graph.c >> >> @@ -693,7 +693,8 @@ static void add_missing_parents(struct packed_oid_list *oids, struct commit *com >> >> } >> >> } >> >> >> >> -static void close_reachable(struct packed_oid_list *oids, int report_progress) >> >> +static void close_reachable(struct packed_oid_list *oids, int report_progress, >> >> + uint64_t oids_count_for_progress) >> >> { >> >> int i; >> >> struct commit *commit; >> >> @@ -717,7 +718,8 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) >> >> */ >> >> if (report_progress) >> >> progress = start_delayed_progress( >> >> - _("Expanding reachable commits in commit graph"), oids->nr); >> >> + _("Expanding reachable commits in commit graph"), >> >> + oids_count_for_progress); >> >> for (i = 0; i < oids->nr; i++) { >> >> display_progress(progress, i + 1); >> >> commit = lookup_commit(the_repository, &oids->list[i]); >> >> @@ -725,6 +727,8 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) >> >> if (commit && !parse_commit(commit)) >> >> add_missing_parents(oids, commit); >> >> } >> >> + if (oids->nr < oids_count_for_progress) >> >> + display_progress(progress, oids_count_for_progress); >> >> stop_progress(&progress); >> >> >> >> if (report_progress) >> >> @@ -829,6 +833,7 @@ void write_commit_graph(const char *obj_dir, >> >> uint64_t progress_cnt = 0; >> >> struct strbuf progress_title = STRBUF_INIT; >> >> unsigned long approx_nr_objects; >> >> + uint64_t oids_count_for_progress = 0; >> >> >> >> if (!commit_graph_compatible(the_repository)) >> >> return; >> >> @@ -934,9 +939,10 @@ void write_commit_graph(const char *obj_dir, >> >> if (oids.progress_done < approx_nr_objects) >> >> display_progress(oids.progress, approx_nr_objects); >> >> stop_progress(&oids.progress); >> >> + oids_count_for_progress = oids.nr; >> >> } >> >> >> >> - close_reachable(&oids, report_progress); >> >> + close_reachable(&oids, report_progress, oids_count_for_progress); >> >> >> >> if (report_progress) >> >> progress = start_delayed_progress( >> >>