Re: [PATCH v2 2/3] bisect: when skipping, choose a commit away from a skipped commit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christian Couder <chriscool@xxxxxxxxxxxxx> writes:

> For example using the following script:
>
> -------------
> #!/bin/sh
>
> new_commit() {
>     num="$1"
>     echo "line for commit $num" >> hello
>     git commit -m "commit $num" hello
> }
>
> touch hello
> git add hello
>
> for val in $(seq 100)
> do
>   new_commit $val
> done

But isn't this a totally uninteresting case of 100 _single_ strand of
pearls?  For a linear history, you do not even need the bisect machinery;
you can even bisect by hand.

>> Imagine you have a graph like this (G is good, B is bad, and both
>> branches have similar number of commits):
>>
>> 	G-------------y--z--a--b--c--B
>>                               /
>>         G--------------f--e--d
>>
>> In the above graph, a and d would give the best bisection, because they
>> can reach about half the number of nodes in the entire graph, and the
>> goal of the basic "find_bisection()" is to find a commit that cuts the
>> graph into closest-to-the-equal halves.  For this reason, 'a' and 'd'
>> would be near the beginning of the output from find_bisection(find_all)
>> you run when there are untestable commits in your managed_skipped().
>>
>> Suppose 'd' was already marked untestable, but 'a' is not.  And 'd' has
>> slightly better "goodness" value than 'a'.
>>
>> 	Side note.  I do not think the situation should change drastically
>> 	if 'a' has a better "goodness" value than 'd' has, but your
>> 	"skipped_first" logic that I already said I do not understand in
>> 	my earlier comment treats them quite differently, so this example
>> 	explicitly talks about the case where 'd' is better than 'a'.
>>
>> You do not check 'a' but check somewhere much older, either on the top
>> branch or on the bottom branch.  Why?
>
> The reason is that it would make the code more complex to check that we are 
> in this case, and that this case may not happen very often (it relies on 
> both being near a merge and having branches of the same size), and that we 
> don't loose much (see above my reference to what HPA wrote) by testing a 
> commit quite far away.

I think what you are missing is that you are not even guaranteeing "quite
far away" in the _topological_ space, especially in the presense of merges.

If you focus on a totally linear history, yes, picking somewhere away, in
the "goodness" scale space, from the optimal point (and there is a single
optimal point) that is untestable happens to take you away from that
particular untestable one in the topology space as well.  But that only
holds true when you have no merges.

My example graph was not an extreme special case at all.  It is just an
illustration of a typical real-life history.  Sure, I told you to assume
that both sides have about equal number of commits, but if the top branch
were longer, then instead of 'a', perhaps 'y' or its parent may be the
next best commit.  It is still very close in the "goodness" space to the
untestable commit 'd', but it is very far away from it in the topological
space, but your algorithm discards it because it is close in the
"goodness" scale space.  And the distance in the topological space from
untestable ones is what we seek here.

>>  - The point chosen should be far from any known untestable commits.  We
>>    do not currently have function to count distance from untestable
>>    commits, nor make a call to such a function after filtering untestable
>>    commits, but I think we should.
>
> I think this should be the case with my patch series.

Why?  You are picking "away, in the goodness scale, from the _best_ one".
I've already explained why it does not follow that the commit chosen that
way is _topologically_ away from _untestable_ ones.

> I agree that we could probably analyse the graph much better than what my 
> patch series does, but I think that it would be quite complex.

If you do not want complexity, let's not even do this "away in the
goodness space from the best one".  Your three patches already add
complexity to the algorithm, and I do not think what they do is worth it.
It solves a wrong problem, that does not have anything to do with "stay
away from untestable ones in the topological space".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]