Re: [bug] "git bisect old v3.0" takes 21 mins on Linux repo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 16, 2025 at 05:52:46AM -0500, Jeff King wrote:

> That clear_distance() call likewise iterates through the list to clear
> the COUNTED flags from each. I guess we might be able to traverse down
> from the tip of the commit we're operating on, clearing flags there.
> Since that's how the flags are set in count_distance().
> 
> I suspect it's still quadratic, though, because count_distance() is
> traversing separately for each (and in the worst case everything is
> reachable from it). But it might still improve things in practice.

Apparently I'm good at suspecting. Here's a patch to make
clear_distance() walk the same commits as count_distance(), including
the string-of-pearls recursion avoidance:

diff --git a/bisect.c b/bisect.c
index 1a9069c9ad..ecf656316b 100644
--- a/bisect.c
+++ b/bisect.c
@@ -69,12 +69,28 @@ static int count_distance(struct commit_list *entry)
 	return nr;
 }
 
-static void clear_distance(struct commit_list *list)
+static void clear_distance(struct commit_list *entry)
 {
-	while (list) {
-		struct commit *commit = list->item;
+	while (entry) {
+		struct commit *commit = entry->item;
+		struct commit_list *p;
+
+		if (commit->object.flags & UNINTERESTING)
+			break;
+		if (!(commit->object.flags & COUNTED))
+			break;
+
 		commit->object.flags &= ~COUNTED;
-		list = list->next;
+
+		p = commit->parents;
+		entry = p;
+		if (p) {
+			p = p->next;
+			while (p) {
+				clear_distance(p);
+				p = p->next;
+			}
+		}
 	}
 }
 
@@ -338,7 +354,7 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
 		if (bisect_flags & FIND_BISECTION_FIRST_PARENT_ONLY)
 			BUG("shouldn't be calling count-distance in fp mode");
 		weight_set(p, count_distance(p));
-		clear_distance(list);
+		clear_distance(p);
 
 		/* Does it happen to be at half-way? */
 		if (!(bisect_flags & FIND_BISECTION_ALL) &&

It cuts Askar's case on my machine from 16m51s to 9m34s. So a big
improvement but still...not great.

I suspect that the whole bisection count algorithm needs to be rewritten
to all run in a single traversal. I guess if you iterate over the
commits in reverse-topo order, you should be able to just compute each
distance as "d(commit) = 1; d(commit) += d(p) for parents(commit)". But
it's not a problem I've thought a lot about, so I'm probably missing
some subtlety.

At any rate, an easier way to time this is:

  git rev-list --bisect v3.0..v6.13-rc7

which is the expensive part of what git-bisect is doing under the hood.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux