Re: [PATCH] bisect: loosen halfway() check for a large number of commits

SZEDER Gábor <szeder.dev@xxxxxxxxx> · Sun, 25 Oct 2020 19:01:36 +0100

On Sat, Oct 24, 2020 at 09:41:27AM +0200, Christian Couder wrote:
> On Thu, Oct 22, 2020 at 8:20 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
> >
> > > However, when we have thousands of commits it's not all that important
> > > to find the _exact_ halfway point, a few commits more or less doesn't
> > > make any real difference for the bisection.
> >
> > Cute idea.
> 
> I like the idea too.
> 
> > > So I ran some tests to see how often that happens: picked random good
> > > and bad starting revisions at least 50k commits apart and a random
> > > first bad commit in between in git.git, and used 'git bisect run git
> > > merge-base --is-ancestor HEAD $first_bad_commit' to check the number
> > > of necessary bisection steps.  After repeating all this 1000 times
> > > both with and without this patch I found that:
> > >
> > >   - 146 cases needed one more bisection step than before, 149 cases
> > >     needed one less step, while in the remaining 705 cases the number
> > >     of steps didn't change.  So the number of bisection steps does
> > >     indeed change in a non-negligible number of cases, but it seems
> > >     that the average number of steps doesn't change in the long run.
> >
> > It somehow is a bit surprising that there are cases that need fewer
> > steps, but I guess that is how rounding-error cuts both ways?
> 
> When there are 50k commits span between the initial good and bad, I
> don't expect to see any statistically significant result by trying it
> 1k times only. My guess is that you might start seeing something
> significant only when the number of tries is a multiple of the span
> between the initial good and bad.

Well, perhaps...  but statistically relevant or not, running those
1000 tests I reported about took over 6.5 hours, so that's all you'll
get from me :)

Btw, just for curiosity, running just _one_ similar test in linux.git
with the good-bad range containing ~830k commits took ~65 minutes, and
the runtime of 'git bisect start' went from ~38mins to ~12.