Re: [PATCH v3 1/2] commit-graph write: add progress output

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 10 2018, SZEDER Gábor wrote:

> On Mon, Sep 17, 2018 at 03:33:35PM +0000, Ævar Arnfjörð Bjarmason wrote:
>>     $ git -c gc.writeCommitGraph=true gc
>>     [...]
>>     Annotating commits in commit graph: 1565573, done.
>>     Computing commit graph generation numbers: 100% (782484/782484), done.
>
> While poking around 'commit-graph.c' in my Bloom filter experiment, I
> saw similar numbers like above, and was confused by the much higher
> than expected number of annotated commits.  It's about twice as much
> as the number of commits in the repository, or the number shown on the
> very next line.
>
>> diff --git a/commit-graph.c b/commit-graph.c
>> index 8a1bec7b8a..2c5d996194 100644
>> --- a/commit-graph.c
>> +++ b/commit-graph.c
>> -static void close_reachable(struct packed_oid_list *oids)
>> +static void close_reachable(struct packed_oid_list *oids, int report_progress)
>>  {
>>  	int i;
>>  	struct commit *commit;
>> +	struct progress *progress = NULL;
>> +	int j = 0;
>>
>> +	if (report_progress)
>> +		progress = start_delayed_progress(
>> +			_("Annotating commits in commit graph"), 0);
>>  	for (i = 0; i < oids->nr; i++) {
>> +		display_progress(progress, ++j);
>>  		commit = lookup_commit(the_repository, &oids->list[i]);
>>  		if (commit)
>>  			commit->object.flags |= UNINTERESTING;
>> @@ -604,6 +616,7 @@ static void close_reachable(struct packed_oid_list *oids)
>>  	 * closure.
>>  	 */
>>  	for (i = 0; i < oids->nr; i++) {
>> +		display_progress(progress, ++j);
>>  		commit = lookup_commit(the_repository, &oids->list[i]);
>>
>>  		if (commit && !parse_commit(commit))
>> @@ -611,19 +624,28 @@ static void close_reachable(struct packed_oid_list *oids)
>>  	}
>
> The above loops have already counted all the commits, and, more
> importantly, did all the hard work that takes time and makes the
> progress indicator useful.
>
>>  	for (i = 0; i < oids->nr; i++) {
>> +		display_progress(progress, ++j);

[...]

> This display_progress() call, however, doesn't seem to be necessary.
> First, it counts all commits for a second time, resulting in the ~2x
> difference compared to the actual number of commits, and then causing
> my confusion.  Second, all what this loop is doing is setting a flag
> in commits that were already looked up and parsed in the above loops.
> IOW this loop is very fast, and the progress indicator jumps from
> ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a
> progress indicator at all.

You're right, I tried this patch on top:

    diff --git a/commit-graph.c b/commit-graph.c
    index a686758603..cccd83de72 100644
    --- a/commit-graph.c
    +++ b/commit-graph.c
    @@ -655,12 +655,16 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress)
     		if (commit)
     			commit->object.flags |= UNINTERESTING;
     	}
    +	stop_progress(&progress); j = 0;

     	/*
     	 * As this loop runs, oids->nr may grow, but not more
     	 * than the number of missing commits in the reachable
     	 * closure.
     	 */
    +	if (report_progress)
    +		progress = start_delayed_progress(
    +			_("Annotating commits in commit graph 2"), 0);
     	for (i = 0; i < oids->nr; i++) {
     		display_progress(progress, ++j);
     		commit = lookup_commit(the_repository, &oids->list[i]);
    @@ -668,7 +672,11 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress)
     		if (commit && !parse_commit(commit))
     			add_missing_parents(oids, commit);
     	}
    +	stop_progress(&progress); j = 0;

    +	if (report_progress)
    +		progress = start_delayed_progress(
    +			_("Annotating commits in commit graph 3"), 0);
     	for (i = 0; i < oids->nr; i++) {
     		display_progress(progress, ++j);
     		commit = lookup_commit(the_repository, &oids->list[i]);

And on a large repo with around 3 million commits the 3rd progress bar
doesn't kick in.

But if I apply this on top:

    diff --git a/progress.c b/progress.c
    index 5a99c9fbf0..89cc705bf7 100644
    --- a/progress.c
    +++ b/progress.c
    @@ -58,8 +58,8 @@ static void set_progress_signal(void)
     	sa.sa_flags = SA_RESTART;
     	sigaction(SIGALRM, &sa, NULL);

    -	v.it_interval.tv_sec = 1;
    -	v.it_interval.tv_usec = 0;
    +	v.it_interval.tv_sec = 0;
    +	v.it_interval.tv_usec = 250000;
     	v.it_value = v.it_interval;
     	setitimer(ITIMER_REAL, &v, NULL);
     }

I.e. start the timer after 1/4 of a second instead of 1 second, I get
that progress bar.

So I'm inclined to keep it. It just needs to be 4x the size before it's
noticeably hanging for 1 second.

That repo isn't all that big compared to what we've heard about out
there, and inner loops like this have a tendency to accumulate some more
code over time without a re-visit of why we weren't monitoring progress
there.

But maybe we can fix the message. We say "Annotating commits in commit
grap", not "Counting" or whatever. I was trying to find something that
didn't imply that we were doing this once. One can annotate a thing more
than once, but maybe ther's a better way to explain this...

We had some more accurate progress reporting in close_reachable(),
discussed in
https://public-inbox.org/git/87efe5qqks.fsf@xxxxxxxxxxxxxxxxxxx/ I still
think the *main* use-case for these things is to just report that we're
not hanging, so maybe the proper solution is to pick up Duy's patch to
display a spinner insted of a numeric progress.

>>  		commit = lookup_commit(the_repository, &oids->list[i]);
>>
>>  		if (commit)
>>  			commit->object.flags &= ~UNINTERESTING;
>>  	}
>> +	stop_progress(&progress);
>>  }



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux