Re: [BUG] v2.16.0-rc0 seg faults when git bisect skip

Martin Ågren <martin.agren@xxxxxxxxx> · Sat, 6 Jan 2018 16:02:34 +0100

On 6 January 2018 at 15:27, Yasushi SHOJI <yasushi.shoji@xxxxxxxxx> wrote:
> best_bisection_sorted() seems to do
>
>  - get the commit list along with the number of elements in the list
>  - walk the list one by one to check whether a element have TREESAME or not
>  - if TREESAME, skip
>  - if not, add it to array
>  - sort the array by distance
>  - put elements back to the list
>
> so, if you find TREESAME, you get less elements than given, right?

All of this matches my understanding.

> Also, if you sort, the last commit, which has NULL in the ->next,
> might get in the middle of the array??

This is a bit tricky. The elements of the linked list are not sorted.
Instead, the items of the linked list are copied into an array and
sorted. The result is then put into a linked list.

Or actually, into the *same* list. That's an optimization to avoid
deallocating all objects in the list, then allocate (roughly) the same
number of objects again. It means all the `next`-pointers are already
set up, and we just need to set the final one to NULL. (It will already
be NULL if and only if the new list has the same length as the old,
i.e., if we didn't skip any TREESAME-commit.)

> # BTW, is it really fast to use QSORT even if you have to convert to
> # an array from list?

You'll find some QSORT/qsort-stuff in git-compat-util.h. We may or may
not end up doing an actual "quick sort". That would depend on, e.g., how
the C-library implements `qsort()`. Also, I'd guess that even if we have
room for an improvement, those cases (small `cnt`!) are already fast
enough that no-one really cares. That is, maybe we could measure a
speed-up, but would anyone really *notice* it?

>> Thank you for providing a script for reproducing this. It helped me come
>> up with the attached patch. The patch is based on ma/bisect-leakfix,
>> which includes Ævar's patch.
>>
>> I think this patch could be useful, either as a final "let's test
>> something previously non-tested; this would have caught the segfault",
>> or simply squashed into Ævar's patch as a "let's add a test that would
>> have caught this, and which also tests a previously non-tested code
>> path."
>
> Do we really need that?  What is a practical use of a commit having
> both good and bad?

There is not much practical use for *having* it. But there might be some
use in *detecting* it. Linus wrote in 670f5fe34f: "Also, add a safety
net: if somebody marks as good something that includes the known-bad
point, we now notice and complain, instead of writing an empty revision
to the new bisection branch."

So there might be some value in verifying that the safety-net works and
tells the user something useful (as opposed to Git segfaulting ;-) ).

Regards,
Martin