Re: first bisection step takes quite a while

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 20 Feb 2025 17:40:53 -0800

Junio C Hamano <gitster@xxxxxxxxx> writes:

> "D. Ben Knoble" <ben.knoble@xxxxxxxxx> writes:
>
>> On Thu, Feb 20, 2025 at 9:38 AM Uwe Kleine-König
>> <u.kleine-koenig@xxxxxxxxxxxx> wrote:
>>>
>>> Hello,
>>>
>>> today I did a bisection in the kernel repository:
>>>
>>>         linux$ git version
>>>         git version 2.47.1
>>>
>>>         linux$ time git bisect start 09fbf3d502050282bf47ab3babe1d4ed54dd1fd8 96d8eab5d0a1a9741a4cae1b3c125d75d1aabedf
>>>         Bisecting: 572238 revisions left to test after this (roughly 19 steps)
>>>         [eafdca4d7010a0e019aaaace3dd71b432a69b54c] Merge tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>>>
>>>         real    18m41.374s
>>>         user    27m18.306s
>>>         sys     1m0.565s
>>>
>>> I was surprised that it took that long to find and checkout the first
>>> revision to check. (That is on a 4 x Intel(R) Core(TM) i5-6440HQ CPU @
>>> 2.60GHz, 16 GiB RAM with a Samsung SSD. On a different machine (56 x
>>> Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz, 256 GiB RAM and (I think a
>>> spinning hard disk)) it took nearly an hour.
>>
>> Related thread:
>> https://lore.kernel.org/git/19461b87a5c.5a2ea74016716.8214238482389812984@xxxxxxxxxxxx/
>
> Indeed.  I haven't had a chance to dig any deeper in the area of the
> code since that discussion, but the ideas raised in the messages
> near the tail end of the thread may be worth exploring.
>
> THanks for the link.

So, here is something that _could_ be the beginning of a patch, but
just to illustrate what I tried.

 * In do_find_bisection(), we try each commit on the incoming commit
   list (which is sorted the way rev-list emits, probably reversed)
   and count how many commits in the set each merge commit can reach
   (which is called "weight") in the "honest and stupid" way.  I try
   to collect these merges in a linear array, and try from the
   middle to older and newer.  As the loop to compute weight for
   merges have an early-exit clause that says "oh, this is good
   enough", this may improve our odds to find a good enough merge
   early.

 * The "this is good enough" logic currently allows us to be within
   0.1% of the real halfway point.  Until the candidate set becomes
   small enough, we could loosen the criteria to allow larger, say
   3%, slack.  This code is written but not enabled (with "0 &&").

 * After computing the weight for a merge in "honest and stupid"
   way, we know what other commits in the set it can reach.  If the
   weight we computed is way smaller than the half the number of
   commits in the set, that means these commits we can reach from
   the merge we are looking at would score even lower.  We could
   mark them as not-viable before clearing the list to check next
   merge with "honest and stupid" way.  Again, this code is written
   but not enabled.

So, in short, I have three ideas, and with the first one (that
is the most straightforward and least error prone) alone, it seems
that we gain significant speedup.

The current code took ~20 minutes for me and its result is

$ git bisect start --no-checkout 09fbf3d5020 96d8eab5d0a1
Bisecting: 581164 revisions left to test after this (roughly 19 steps)
[2c71ab4bb465c79a4687cc2fd5012e470aebdb1f] Merge branch 'for-upstre...

There are 1144459 commits in the range, and the point chosen by
bisection can reach 563294 of them.  563294*2 == 1126588, so we are
1144459 - 1126588 = 17871 commits away from the theoretical halfway.

With the "let's try from the midway merges" approach without
changing anything else, I get a different commit (because the
original algorithm is taking "good enough" early exit), and it took
about 30 seconds.

$ git bisect start --no-checkout 09fbf3d5020 96d8eab5d0a1
Bisecting: 572238 revisions left to test after this (roughly 19 steps)
[eafdca4d7010a0e019aaaace3dd71b432a69b54c] Merge tag 'staging-4.18-...

The size of the original range is the same, of course, 1144459
commits, and the point chosen by bisection reaches 572220 of them.
Since 572220*2 = 1144440, we are 1144459 - 1144440 = 19 commits from
the theoretical halfway.

My current thinking is that the heuristics #1 (which is enabled in
my experiment and in the following patch) is good enough, #2
(loosening the "good enough" threshold) is probably not very
effective, and #3 (discard ones that are closer to the good end than
a merge that is known to be not viable) might be interesting to
pursue further but probably tricky to get right.

Comments?


 bisect.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 81 insertions(+), 8 deletions(-)

diff --git c/bisect.c w/bisect.c
index 7a3c77c6d8..eae8e97958 100644
--- c/bisect.c
+++ w/bisect.c
@@ -23,6 +23,7 @@
 #include "object-store-ll.h"
 #include "path.h"
 #include "dir.h"
+#include "trace2.h"
 
 static struct oid_array good_revs;
 static struct oid_array skipped_revs;
@@ -132,8 +133,10 @@ static inline int approx_halfway(struct commit_list *p, int nr)
 		 * good enough if it's within ~0.1% of the halfway point,
 		 * e.g. 5000 is exactly halfway of 10000, but we consider
 		 * the values [4996, 5004] as halfway as well.
+		 * While we have really large number of commits, we'll
+		 * loosen our target to hit within 3% of the harfway.
 		 */
-		if (abs(diff) < nr / 1024)
+		if ((0 && 10000 < nr && abs(diff) < nr / 64) || abs(diff) < nr / 1024)
 			return 1;
 		return 0;
 	}
@@ -282,11 +285,15 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
 					     int nr, int *weights,
 					     unsigned bisect_flags)
 {
-	int n, counted;
+	int n, counted, num_merges;
 	struct commit_list *p;
+	struct commit_list **merge; /* ugh */
 
+	num_merges = 0;
 	counted = 0;
+	num_merges = 0;
 
+	trace2_region_enter("bisect", "do_find_bisection_0", the_repository);
 	for (n = 0, p = list; p; p = p->next) {
 		struct commit *commit = p->item;
 		unsigned commit_flags = commit->object.flags;
@@ -309,16 +316,30 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
 			weight_set(p, -1);
 			break;
 		default:
+			num_merges++;
 			weight_set(p, -2);
 			break;
 		}
 	}
+	trace2_region_leave("bisect", "do_find_bisection_0", the_repository);
 
 	show_list("bisection 2 initialize", counted, nr, list);
 
+	/*
+	 * Collect merges into an array.
+	 */
+	CALLOC_ARRAY(merge, num_merges);
+	for (n = 0, p = list; p; p = p->next) {
+		if (weight(p) != -2)
+			continue;
+		merge[n++] = p;
+	}
+	if (num_merges != n)
+		BUG("Whoa!");
+
 	/*
 	 * If you have only one parent in the resulting set
-	 * then you can reach one commit more than that parent
+	 * then you can reach one commit more than your parent
 	 * can reach.  So we do not have to run the expensive
 	 * count_distance() for single strand of pearls.
 	 *
@@ -330,25 +351,73 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
 	 * So we will first count distance of merges the usual
 	 * way, and then fill the blanks using cheaper algorithm.
 	 */
-	for (p = list; p; p = p->next) {
-		if (p->item->object.flags & UNINTERESTING)
-			continue;
+	trace2_region_enter("bisect", "do_find_bisection_1", the_repository);
+
+	/*
+	 * Use the element of a list from its midpoint.
+	 * (num_merges == 1) mid = 0; ix = 0
+	 * (num_merges == 2) mid = 0; ix = 0, 1
+	 * (num_merges == 3) mid = 1; ix = 1, 2, 0
+	 * (num_merges == 4) mid = 1; ix = 1, 2, 0, 3
+	 * (num_merges == 5) mid = 2; ix = 2, 3, 1, 4, 0
+	 * (num_merges == 6) mid = 2; ix = 2, 3, 1, 4, 0, 5
+	 */
+	for (n = 0; n < num_merges; n++) {
+		struct commit_list *p;
+		int ix = (num_merges - 1) / 2;
+
+		if (n % 2)
+			ix += (n + 1) / 2;
+		else
+			ix -= n / 2;
+
+		p = merge[ix];
 		if (weight(p) != -2)
 			continue;
 		if (bisect_flags & FIND_BISECTION_FIRST_PARENT_ONLY)
 			BUG("shouldn't be calling count-distance in fp mode");
 		weight_set(p, count_distance(p));
+
+#if 0
+		/* 
+		 * If the current merge can reach way fewer than half
+		 * the commits in the graph, any of its ancestors can
+		 * reach even fewer commits, which means they will not
+		 * make better half-way candidate than this one.
+		 *
+		 * Just as a slack, let's cut at 3/8 not exactly 1/2.
+		 */
+		if (weight(p) * 8 < nr * 3) {
+			for (struct commit_list *q = list; q; q = q->next) {
+				if (q->item->object.flags & UNINTERESTING)
+					continue;
+				if (!(q->item->object.flags & COUNTED))
+					continue;
+				if (weight(q) != -2)
+					continue;
+				/* mark it as not a viable candidate */
+				weight_set(q, 1);
+			}
+		}
+#endif
 		clear_distance(list);
 
 		/* Does it happen to be at half-way? */
 		if (!(bisect_flags & FIND_BISECTION_ALL) &&
-		      approx_halfway(p, nr))
+		    approx_halfway(p, nr)) {
+			trace2_data_string("bisect", the_repository, "early-exit", "do_find_bisection_1");
+			trace2_region_leave("bisect", "do_find_bisection_1", the_repository);
+			free(merge);
 			return p;
+		}
 		counted++;
 	}
+	trace2_region_leave("bisect", "do_find_bisection_1", the_repository);
+	free(merge);
 
 	show_list("bisection 2 count_distance", counted, nr, list);
 
+	trace2_region_enter("bisect", "do_find_bisection_2", the_repository);
 	while (counted < nr) {
 		for (p = list; p; p = p->next) {
 			struct commit_list *q;
@@ -384,10 +453,14 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
 
 			/* Does it happen to be at half-way? */
 			if (!(bisect_flags & FIND_BISECTION_ALL) &&
-			      approx_halfway(p, nr))
+			    approx_halfway(p, nr)) {
+			trace2_data_string("bisect", the_repository, "early-exit", "do_find_bisection_2");
+				trace2_region_leave("bisect", "do_find_bisection_2", the_repository);
 				return p;
+			}
 		}
 	}
+	trace2_region_leave("bisect", "do_find_bisection_2", the_repository);
 
 	show_list("bisection 2 counted all", counted, nr, list);