On Fri, Apr 16, 2021 at 3:13 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Fri, 16 Apr 2021 09:51:45 +0200
> Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > If you look at substantial base of bisection logs, you will find lots
> > of cases where bug types, functions don't match. Kernel crashes
> > differently even on the same revision. And obviously things change if
> > you change revisions. Also if you see presumably a different bug, what
> > does it say regarding the original bug.
>
> Yes, but there are also several types of cases where the issue will be the
> same. Namely lockdep. I agree that use after free warnings can have a side
> effect, and may be more difficult.

But how do we know it's lockdep, rather than a use-after-free
manifested as lockdep?
A significant portion of kernel bugs are caused by concurrency and can
manifest in different ways, i.e. they are not lockdep, or WARN, or
use-after-free bugs, but rather races by nature.

> But there's many other bugs that remain
> consistent across kernels. And if you stumble on one of them, look for it
> only.

For example? That does not look to be true for WARN, BUG, KASAN, or
"inconsistent lock state".

> And if you hit another bug, and if it doesn't crash, then ignore it (of
> course this could be an issue if you have panic on warning set). But
> otherwise, just skip it.

It's not possible to skip, say, BUG.
And if we skip, say, a use-after-free, how do we know we are not
making things worse? Because now we are running on corrupted memory,
so anything can happen. A stray lockdep report can definitely happen,
or, the other way around, not happen when it should...

> > I would very much like to improve automatic bisection quality, but it
> > does not look trivial at all.
> >
> > Some random examples where, say, your hypothesis of WARN-to-WARN,
> > BUG-to-BUG does not hold even on the same kernel revision (add to this
>
> At least lockdep to lockdep, as when I do manual bisects, that's exactly
> what I look for, and ignore all other warnings. And that has found the
> problem commit pretty much every time.

What lockdep bug types do you mean? All?
In the examples above you can see at least "inconsistent lock state"
mixed with 2 other completely different bug types.

> > different revisions and the fact that a different bug does not give
> > info regarding the original bug):
> >
>
> Can you tell me that all these examples bisected to the commit that caused
> the bug? Because if it did not, then you may have just proved my point ;-)

I don't know now what the result was, but within a single run these
were manifestations of the same root bug.
E.g. see below, that's a UAF in fuse_dev_do_read vs a WARNING in
request_end. request_end is also fuse. And you can see a memory
corruption causing a random bug type, in this case WARNING, but it
could just as well be LOCKDEP.

> > run #0: crashed: KASAN: use-after-free Read in fuse_dev_do_read
> > run #1: crashed: WARNING in request_end
> > run #2: crashed: KASAN: use-after-free Read in fuse_dev_do_read
> > run #3: OK
> > run #4: OK
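
To make the mixed-manifestation point concrete, here is a minimal sketch that tallies crash types from a run log shaped like the one quoted above. This is not syzbot's actual code: the regex, the coarse type buckets, and the names coarse_type and tally are assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumed line format, modelled on the quoted log:
#   "run #0: crashed: KASAN: use-after-free Read in fuse_dev_do_read"
#   "run #3: OK"
RUN_RE = re.compile(r"run #(\d+): (?:crashed: (?P<title>.+)|OK)\s*$")


def coarse_type(title):
    """Map a crash title to a coarse bucket (illustrative, not exhaustive)."""
    if title.startswith("KASAN:"):
        return "KASAN"
    if title.startswith("WARNING"):
        return "WARNING"
    if title.startswith("BUG:"):
        return "BUG"
    return "OTHER"


def tally(log_lines):
    """Count coarse crash types; runs that did not crash count as "OK"."""
    counts = Counter()
    for line in log_lines:
        # Drop email quote markers such as "> > " before matching.
        m = RUN_RE.match(line.lstrip("> ").strip())
        if not m:
            continue  # not a run line
        title = m.group("title")
        counts[coarse_type(title) if title else "OK"] += 1
    return counts


log = [
    "> > run #0: crashed: KASAN: use-after-free Read in fuse_dev_do_read",
    "> > run #1: crashed: WARNING in request_end",
    "> > run #2: crashed: KASAN: use-after-free Read in fuse_dev_do_read",
    "> > run #3: OK",
    "> > run #4: OK",
]
counts = tally(log)
print(dict(counts))
```

On the quoted log this counts two KASAN runs, one WARNING run and two OK runs. A check that only accepts the original KASAN title would treat run #1 as a different bug, even though all three crashes in that step came from the same root cause.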