Re: [PATCH v2 18/21] bisect: prepare for different algorithms based on find_all

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stephan Beyer <s-beyer@xxxxxxx> writes:

> This is a preparation commit with copy-and-paste involved.
> The function do_find_bisection() is changed and copied to
> two almost similar functions compute_all_weights() and
> compute_relevant_weights().
>
> The function compute_relevant_weights() stops when a
> "halfway" commit is found.
>
> To keep the code clean, the halfway commit is not returned
> and has to be found by best_bisection() afterwards.
> This results in a singular additional O(#commits)-time
> overhead but this will be outweighed by the following
> changes to compute_relevant_weights().
>
> It is necessary to keep compute_all_weights() for the
> "git rev-list --bisect-all" command. All other bisect-related
> commands will use compute_relevant_weights().
>
> Signed-off-by: Stephan Beyer <s-beyer@xxxxxxx>
> ---
>  bisect.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 97 insertions(+), 19 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index a254f28..c6bad43 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -179,6 +179,7 @@ static struct commit_list *best_bisection(struct commit_list *list)
>  		}
>  	}
>  
> +	best->next = NULL;
>  	return best;
>  }

At this point of this patch it is unclear how this change is
relevant.  I'll read on without complaining but with puzzlement.

> @@ -245,9 +246,8 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list)
>   * unknown.  After running compute_weight() first, they will get zero
>   * or positive distance.
>   */
> -static struct commit_list *do_find_bisection(struct commit_list *list,
> -					     struct node_data *weights,
> -					     int find_all)
> +static void compute_all_weights(struct commit_list *list,
> +				struct node_data *weights)
>  {
>  	int n, counted;
>  	struct commit_list *p;
> @@ -301,10 +301,88 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
>  		if (!(p->item->object.flags & UNINTERESTING)
>  		 && (node_data(p->item)->weight == -2)) {
>  			compute_weight(p->item);
> +			counted++;
> +		}
> +	}
> +
> +	show_list("bisection 2 compute_weight", counted, list);
> +
> +	while (counted < total) {
> +		for (p = list; p; p = p->next) {
> +			struct commit_list *q;
> +			unsigned flags = p->item->object.flags;
> +
> +			if (0 <= node_data(p->item)->weight)
> +				continue;
> +			for (q = p->item->parents; q; q = q->next) {
> +				if (q->item->object.flags & UNINTERESTING)
> +					continue;
> +				if (0 <= node_data(q->item)->weight)
> +					break;
> +			}
> +			if (!q)
> +				continue;
> +
> +			/*
> +			 * weight for p is unknown but q is known.
> +			 * add one for p itself if p is to be counted,
> +			 * otherwise inherit it from q directly.
> +			 */

This is not a new problem, but "q" in the comment should be
"q->item"; my bad.

> +			node_data(p->item)->weight = node_data(q->item)->weight;
> +			if (!(flags & TREESAME)) {
> +				node_data(p->item)->weight++;
> +				counted++;
> +				show_list("bisection 2 count one",
> +					  counted, list);
> +			}

This is not a new problem, and I do admit that I wrote the original
at 1daa09d9 (make the previous optimization work also on
path-limited rev-list --bisect, 2007-03-23), but this makes me
wonder why counted++ is not done for p that is treesame.

 ... goes and looks ...

Ahh, the original while loop was counting upto "nr", which is
different from total.  When the traversal is pathspec limited, I
counted in the original (and probably in 'master' branch before your
series) the number of tree-changing commits in 'nr' which can be
smaller than 'total'.  That is why skipping counted++ on a treesame
commit was the correct thing to do.

Is it possible that you did not test --bisect-all with pathspec?  I
have a suspicion that the above loop would not terminate because of
this change.  If that is the case, either "make total global and get
rid of nr" needs to be fixed, or (probably better) move counted++
out of this if statement.  At this point, counted indicates "how
many commits out of 'total' we have computed weight for?", so the
latter would make more sense to me, even though either is OK.

The "priming the well" step for the "no interesting parent--set
weight to either 0 or 1" also needs to adjust counted++, I suspect.

> +		}
> +	}
> +	show_list("bisection 2 counted all", counted, list);
> +}
> +
> +/* At the moment this is basically the same as compute_all_weights()
> + * but with a halfway shortcut */

/*
 * We write our multi-line comments like this.
 * The first and last lines have only asterisk
 * and slash.
 */

> +static void compute_relevant_weights(struct commit_list *list,
> +				     struct node_data *weights)
> +{
> +	int n, counted;
> +	struct commit_list *p;
> +
> +	counted = 0;
> +
> +	for (n = 0, p = list; p; p = p->next) {
> +		struct commit *commit = p->item;
> +		unsigned flags = commit->object.flags;
> +
> +		commit->util = &weights[n++];
> +		switch (count_interesting_parents(commit)) {
> +		case 0:
> +			if (!(flags & TREESAME)) {
> +				node_data(commit)->weight = 1;
> +				counted++;
> +				show_list("bisection 2 count one",
> +					  counted, list);
> +			}
> +			break;
> +		case 1:
> +			node_data(commit)->weight = -1;
> +			break;
> +		default:
> +			node_data(commit)->weight = -2;
> +			break;
> +		}
> +	}
> +
> +	show_list("bisection 2 initialize", counted, list);
> +
> +	for (p = list; p; p = p->next) {
> +		if (!(p->item->object.flags & UNINTERESTING)
> +		 && (node_data(p->item)->weight == -2)) {
> +			compute_weight(p->item);
>  
>  			/* Does it happen to be at exactly half-way? */
> -			if (!find_all && halfway(p->item))
> -				return p;
> +			if (halfway(p->item))
> +				return;

It is somewhat curious why this, after finding the desired commit as
p->item, does not return it.  If it is already known that we have
half-way one, can't we return it immediately (and when we fall
through without finding exactly the half-way one, we signal the
caller that it needs best_bisection() among the list by returning a
NULL or something?

But probably that is optimizing it prematurely.  I dunno.

>  			counted++;
>  		}
>  	}
> @@ -341,17 +419,11 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
>  			}
>  
>  			/* Does it happen to be at exactly half-way? */
> -			if (!find_all && halfway(p->item))
> -				return p;
> +			if (halfway(p->item))
> +				return;
>  		}
>  	}
> -
>  	show_list("bisection 2 counted all", counted, list);
> -
> -	if (!find_all)
> -		return best_bisection(list);
> -	else
> -		return best_bisection_sorted(list);
>  }
>  
>  struct commit_list *find_bisection(struct commit_list *list,
> @@ -365,6 +437,9 @@ struct commit_list *find_bisection(struct commit_list *list,
>  	total = 0;
>  	marker = 0;
>  
> +	if (!list)
> +		return NULL;
> +
>  	show_list("bisection 2 entry", 0, list);
>  
>  	/*
> @@ -391,13 +466,16 @@ struct commit_list *find_bisection(struct commit_list *list,
>  	*all = total;
>  	weights = (struct node_data *)xcalloc(on_list, sizeof(*weights));
>  
> -	/* Do the real work of finding bisection commit. */
> -	best = do_find_bisection(list, weights, find_all);
> -	if (best) {
> -		if (!find_all)
> -			best->next = NULL;
> -		*reaches = node_data(best->item)->weight;
> +	if (find_all) {
> +		compute_all_weights(list, weights);
> +		best = best_bisection_sorted(list);
> +	} else {
> +		compute_relevant_weights(list, weights);
> +		best = best_bisection(list);
>  	}
> +	assert(best);
> +	*reaches = node_data(best->item)->weight;
> +
>  	free(weights);
>  
>  	return best;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]