On Sun, Mar 17, 2019 at 9:41 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Mar 17, 2019 at 10:12 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > Please see https://github.com/google/syzkaller/blob/master/docs/syzbot.md#bisection
> > it should answer all of your questions. It does 2 and more.
> > And in this case it seems to be working as intended bisecting it to a
> > release tag.
>
> No, it's definitely not working as intended.
>
> You can see it in the bisect log - you don't actually have a single
> "git bisect bad" outside of the initial one that you start bisecting
> with. That's a pretty good sign of bisection being completely broken.
> Yes, it can happen in theory, but in general with a good bisection,
> you should see about as many "good" results as "bad".
>
> I bet that what's going on is that your initial "let's test every
> release" uses a _different_ process than the actual bisection itself
> does.
>
> So if I were you, I'd look at what syzbot does differently during
> bisection vs what it does for that initial "test each release". For
> example, does it do "make clean" in between each build in one case,
> but not the other? Does it do "make oldconfig" vs a fixed config
> generated from scratch every time? Because the fact that you first
> tested 4.10 bad using the "test each release", and then when you do
> bisection, the very commit *before* 4.10 is good (the only difference
> being the EXTRAVERSION and the tag) shows that something went wrong.

Well, this is intended behavior for some definition of intended.
The root cause of what happened here is that syzbot has to disable
CONFIG_USBIP_VHCI_HCD/CONFIG_BT_HCIVHCI when it crosses the v4.10
boundary. That fixes boot on the release, and otherwise no bisection
would succeed at all. It just so happened that this particular bug
depends on these exact configs and was introduced before v4.10. So it
was bisected to v4.10. And in this sense it is working as intended.
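
The effect described above can be reproduced with a toy model. This is only an illustrative sketch, not syzbot's code: commits are modeled as integers on a linear history, and BOUNDARY, BUG_INTRODUCED, config_enabled(), crashes() and bisect() are hypothetical names. It assumes, based on the exchange above, that the bug-relevant configs stay enabled at the v4.10 tag itself and are disabled for every commit before it.

```python
# Toy model of a bisection where the test setup changes at a release boundary.
# All names and numbers are hypothetical; nothing here is taken from syzbot.

BOUNDARY = 16        # index standing in for the v4.10 tag
BUG_INTRODUCED = 5   # index of the commit that really introduced the bug


def config_enabled(commit: int) -> bool:
    """Assumption: the relevant configs are only enabled from the boundary on."""
    return commit >= BOUNDARY


def crashes(commit: int) -> bool:
    """The crash reproduces only if the bug is present AND the config is on."""
    return commit >= BUG_INTRODUCED and config_enabled(commit)


def bisect(good: int, bad: int):
    """Plain binary search; returns the first bad commit and the verdict log."""
    log = []
    while bad - good > 1:
        mid = (good + bad) // 2
        if crashes(mid):
            log.append((mid, "bad"))
            bad = mid
        else:
            log.append((mid, "good"))
            good = mid
    return bad, log


first_bad, log = bisect(good=0, bad=BOUNDARY)
print("first bad commit:", first_bad)
for commit, verdict in log:
    print(f"commit {commit}: {verdict}")
```

In this model every tested midpoint comes back "good" and the search converges on the boundary commit, even though the bug was introduced much earlier. That matches both observations in this thread: a log with no "bad" results after the initial one, and a result that points at the release tag.
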

How would you define intended bisection behavior for the situation
when the kernel is build/boot/test broken most of the time, even on
releases and even on recent releases? ;)
I guess the 100% fair answer is "the bug happens as far as we could
test (which is not too far)". And that's what I did initially, but the
result was way less useful than what we have now.

This and other details of the process are described here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#bisection
This was the first attempt at giving more transparency into the process.

I see 2 potential improvements:
1. (simpler) noting in the bisection log things like disabled configs,
cherry-picked fixes and other things necessary to repair the kernel.
2. (harder) try to figure out that the bug actually depends on the
disabled config

I've added this to https://github.com/google/syzkaller/issues/1051
But for (2) I would first like to see that this is a common enough
problem rather than a one-off thing, because it's easier to say than
to implement reliably, and it can affect bugs completely unrelated to
the disabled configs due to unavoidable kernel crash flakes (and then
somebody will need to explain what happened to all the people asking).

And obviously doing some real testing before merging each commit into
any kernel tree would help tremendously with bisection long term ;)
Even v5.0 is boot broken if I try to enable more configs. So we will
need to disable more configs in bisection in future as we onboard them
to syzbot. The current points in time at which we need to disable
various configs suspiciously resemble when they were added to the
syzbot config...
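
Improvement (1) above, recording the repairs in the bisection log, could look roughly like the following. This is a hedged sketch only: the BisectStep class, its fields and the log line format are all hypothetical and do not reflect syzbot's actual log format or implementation. The config names are the two mentioned earlier in this message, and "v4.10~1" is just git revision syntax for the commit before the tag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BisectStep:
    """One tested commit plus whatever was done to make it build and boot."""
    commit: str
    verdict: str                                   # "good" or "bad"
    disabled_configs: List[str] = field(default_factory=list)
    cherry_picks: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the step as a log line that also lists the repairs."""
        line = f"testing commit {self.commit}: {self.verdict}"
        notes = []
        if self.disabled_configs:
            notes.append("disabled configs: " + ", ".join(self.disabled_configs))
        if self.cherry_picks:
            notes.append("cherry-picked: " + ", ".join(self.cherry_picks))
        if notes:
            line += " [" + "; ".join(notes) + "]"
        return line


steps = [
    BisectStep("v4.10", "bad"),
    BisectStep(
        "v4.10~1",
        "good",
        disabled_configs=["CONFIG_USBIP_VHCI_HCD", "CONFIG_BT_HCIVHCI"],
    ),
]
for step in steps:
    print(step.render())
```

With the repairs printed next to each verdict, a reader of the log can see directly that the "good" result just before the tag was obtained with a different config than the "bad" result on the tag.
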