On Wed, Oct 10 2018, SZEDER Gábor wrote: > On Mon, Sep 17, 2018 at 03:33:35PM +0000, Ævar Arnfjörð Bjarmason wrote: >> $ git -c gc.writeCommitGraph=true gc >> [...] >> Annotating commits in commit graph: 1565573, done. >> Computing commit graph generation numbers: 100% (782484/782484), done. > > While poking around 'commit-graph.c' in my Bloom filter experiment, I > saw similar numbers like above, and was confused by the much higher > than expected number of annotated commits. It's about twice as much > as the number of commits in the repository, or the number shown on the > very next line. > >> diff --git a/commit-graph.c b/commit-graph.c >> index 8a1bec7b8a..2c5d996194 100644 >> --- a/commit-graph.c >> +++ b/commit-graph.c >> -static void close_reachable(struct packed_oid_list *oids) >> +static void close_reachable(struct packed_oid_list *oids, int report_progress) >> { >> int i; >> struct commit *commit; >> + struct progress *progress = NULL; >> + int j = 0; >> >> + if (report_progress) >> + progress = start_delayed_progress( >> + _("Annotating commits in commit graph"), 0); >> for (i = 0; i < oids->nr; i++) { >> + display_progress(progress, ++j); >> commit = lookup_commit(the_repository, &oids->list[i]); >> if (commit) >> commit->object.flags |= UNINTERESTING; >> @@ -604,6 +616,7 @@ static void close_reachable(struct packed_oid_list *oids) >> * closure. >> */ >> for (i = 0; i < oids->nr; i++) { >> + display_progress(progress, ++j); >> commit = lookup_commit(the_repository, &oids->list[i]); >> >> if (commit && !parse_commit(commit)) >> @@ -611,19 +624,28 @@ static void close_reachable(struct packed_oid_list *oids) >> } > > The above loops have already counted all the commits, and, more > importantly, did all the hard work that takes time and makes the > progress indicator useful. > >> for (i = 0; i < oids->nr; i++) { >> + display_progress(progress, ++j); [...] > This display_progress() call, however, doesn't seem to be necessary. > First, it counts all commits for a second time, resulting in the ~2x > difference compared to the actual number of commits, and then causing > my confusion. Second, all what this loop is doing is setting a flag > in commits that were already looked up and parsed in the above loops. > IOW this loop is very fast, and the progress indicator jumps from > ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a > progress indicator at all. You're right, I tried this patch on top: diff --git a/commit-graph.c b/commit-graph.c index a686758603..cccd83de72 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -655,12 +655,16 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) if (commit) commit->object.flags |= UNINTERESTING; } + stop_progress(&progress); j = 0; /* * As this loop runs, oids->nr may grow, but not more * than the number of missing commits in the reachable * closure. */ + if (report_progress) + progress = start_delayed_progress( + _("Annotating commits in commit graph 2"), 0); for (i = 0; i < oids->nr; i++) { display_progress(progress, ++j); commit = lookup_commit(the_repository, &oids->list[i]); @@ -668,7 +672,11 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) if (commit && !parse_commit(commit)) add_missing_parents(oids, commit); } + stop_progress(&progress); j = 0; + if (report_progress) + progress = start_delayed_progress( + _("Annotating commits in commit graph 3"), 0); for (i = 0; i < oids->nr; i++) { display_progress(progress, ++j); commit = lookup_commit(the_repository, &oids->list[i]); And on a large repo with around 3 million commits the 3rd progress bar doesn't kick in. But if I apply this on top: diff --git a/progress.c b/progress.c index 5a99c9fbf0..89cc705bf7 100644 --- a/progress.c +++ b/progress.c @@ -58,8 +58,8 @@ static void set_progress_signal(void) sa.sa_flags = SA_RESTART; sigaction(SIGALRM, &sa, NULL); - v.it_interval.tv_sec = 1; - v.it_interval.tv_usec = 0; + v.it_interval.tv_sec = 0; + v.it_interval.tv_usec = 250000; v.it_value = v.it_interval; setitimer(ITIMER_REAL, &v, NULL); } I.e. start the timer after 1/4 of a second instead of 1 second, I get that progress bar. So I'm inclined to keep it. It just needs to be 4x the size before it's noticeably hanging for 1 second. That repo isn't all that big compared to what we've heard about out there, and inner loops like this have a tendency to accumulate some more code over time without a re-visit of why we weren't monitoring progress there. But maybe we can fix the message. We say "Annotating commits in commit grap", not "Counting" or whatever. I was trying to find something that didn't imply that we were doing this once. One can annotate a thing more than once, but maybe ther's a better way to explain this... We had some more accurate progress reporting in close_reachable(), discussed in https://public-inbox.org/git/87efe5qqks.fsf@xxxxxxxxxxxxxxxxxxx/ I still think the *main* use-case for these things is to just report that we're not hanging, so maybe the proper solution is to pick up Duy's patch to display a spinner insted of a numeric progress. >> commit = lookup_commit(the_repository, &oids->list[i]); >> >> if (commit) >> commit->object.flags &= ~UNINTERESTING; >> } >> + stop_progress(&progress); >> }