On Thu, 1 Mar 2007, Ingo Molnar wrote: > > * Ingo Molnar <mingo at elte.hu> wrote: > > > update: f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38 works for me too, and > > 01363220f5d23ef68276db8974e46a502e43d01d is broken. I too will attempt > > to bisect this. > > hm. There's some weird bisection artifact here. Here are the commits i > tested, in git-log order: > > #1 commit 01363220f5d23ef68276db8974e46a502e43d01d bad > #2 commit ee404566f97f9254433399fbbcfa05390c7c55f7 bad > #3 commit f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38 good > #4 commit c827ba4cb49a30ce581201fd0ba2be77cde412c7 bad Use "git bisect visualize" to see what bisect ends up doing. > if i tell git-bisect that #1 is bad and #3 is good, then it offers me #2 > - that's OK. But when i tell it that #2 is bad, it offers #4 - which is > out of order! No it's not. "git bisect" does exactly the right thing. There is no simple ordering in a complex branch-merge schenario, you can't just put the commits in some "ordering" and test things in time order. That would be totally broken, and idiotic. It doesn't give the right results. What git bisect does is to find the commit that most closely *bisects* the history of commits, so that if it is marked good/bad, it will leave you with about 50% of the commits left. But if you are looking at date order, you're entirely confused. For example, let's take a really simple case a <- bad / \ b c | | d e | | f g \ / h | * <-good and if you are looking to find something "in the middle", you might thing that "d" or "e" are the best choices, since time-wise, they are in the middle. But that's not true AT ALL. If you actually want to bisect that kind of history, you need to choose "b" or "c", even though they may both be *much* more "recent" than the others. Why? Because if you pick "d", you're really only testing three commits ('d' 'f' and 'h') out of the 8 commits you have to test. In contrast, if you pick 'b', you are testing the effects of *four* commits ('b', 'd', 'f' and 'h') and you have thus neatly bisected the commits into two equal groups for testing (one group _with_ those four commits, and one group _without_) instead of having partitioned them as 3 commits vs 5 commits. So please realize that non-linear history very much means that you MUST NOT think that you just pick a commit "in the middle". No, git bisect is a LOT smarter than that - it picks a commit that *reaches* about half the commits you have left to test. > The bisection goes off into la-la land after that and > never gets back to a commit that is /after/ the good commit. How is this > possible? (I upgraded from git-1.4.4 to 1.5.0 to make sure this isnt > some git bug that's already fixed.) It's possible because git knows what it is doing, and you didn't think things through. The commits that "git bisect" picked out are the right ones. Quite often, there may be two or more "equally good" commits (in my example above, you can choose either "b" or "c", and it will bisect the set of untested commits equally well - in two groups of four, but two *different* groups of four commits), and yes, it's possible that git has a bug that makes it pick the wrong ones, but quite frankly, I seriously doubt it. "git bisect" has been very successful indeed, and is generally a *lot* better at picking a commit "in the middle" than people are, exactly because it's quite hard to see which commit "reaches" half the commits if you have lots of merges and branches. Try out git bisect visualize and it will literally show you what it is doing. What can be confusing is that if the "good" and "bad" markers are ON DIFFERENT BRANCHES OF DEVELOPMENT, you may not even *see* the "good" marker, because you may well have something like this: a <- bad | b * <- good | | c d \ / e | f | ... and what do you think "git bisect visualize" will actually show you? Since 'd', 'e' and 'f' are all in the "good" set (they both exist as commits in something leading up to a commit that has already been deemed fine), they aren't *interesting* - they can't be introducing the bug, since if that was the case, the good commit wouldn't have been good. So as far as bisection is concerned, the tree actually looks like a <- bad | b | c | ... and you have just three commits that are potentially interesting: 'a', 'b' and 'c'. Now, with three commits, you cannot test them half-and-half, so you have to test it in groups of 1 vs 2 commits, so it's arbitrary whether you choose 'b' or 'c' to test, but you'd test one of them. Say that you choose 'b', and it turns out to be good. If so, you're done: 'a' is bad and 'b' is good, so the bug was introduced in 'a'. But if it turns out to be bad, you'll still have to test 'c' too, since you don't know if the bug was *introduced* in 'b' or not. See? > i'll try to straighten this out manually Don't. You're just going to make your bisection much less effective. The whole point of bisection is that you can usually cut the number of commits to test pretty exactly in half. If you start mucking with the commits to test, and you don't understand about the reachability graph, you'll just choose a much worse set of commits to test than "git bisect" will do. So learn to trust "git bisect". It really does know what it is doing. Linus