On Wed, Jan 2, 2019 at 3:08 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Fri, Dec 28, 2018 at 10:09 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Dec 28, 2018 at 1:43 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > >
> > > > Nobody reads the kernel mailing list directly - there's just too much traffic.
> > >
> > > As a result bug reports and patches get lost and this is bad and it
> > > would be useful to stop it from happening and there are known ways for
> > > this.
> >
> > Well, let me be a bit more specific: you will find that people read
> > the very _targeted_ mailing lists, because they not only tend to be
> > more specific to some particular interest, but also aren't the flood
> > of hundreds of emails a day.
> >
> > And don't get me wrong: I'm not saying that lkml is useless. Not at
> > all. It's just that it's really more of an archival model than a
> > "people read it" - so you send your emails to a group of people, and
> > then you cc lkml so that when that group gets expanded people can be
> > pointed at the whole thread. Or, obviously, so that commit messages
> > etc can point to discussion.
> >
> > But that does mean that any lkml cc shouldn't be expected to cause a
> > reaction in itself. It's about other things.
> >
> > > syzbot not doing bisection is not the root cause of this
> >
> > Root cause? No. But if you do bisection, it means that you can now
> > target things much better. So then it's not lkml and "random
> > collection of maintainers", but a much more targeted group.
> >
> > And that targeted group also ends up being a lot more receptive to it.
> >
> > Again, look at the raw syzbot email and the email by Wanpeng Li. Yes,
> > the syzbot email did bring in a reasonable set of people just based on
> > the oops (I think it did "get_maintainer" on kvm_ioapic_scan_entry()).
> > But Wanpeng ended up sending it to the *particular* people who were
> > directly responsible.
> >
> > > 2. syzbot reports are not worse than average human reports, frequently better.
> >
> > No, they really aren't.
> >
> > They are better in a *technical* sense, but they are also very much
> > obviously automated, which makes the target people take them much less
> > seriously.
> >
> > When you see lots of syzbot emails, and there are lots of more or less
> > random recipients that may or may not be correct, what's the natural
> > reaction to that?
> >
> > Look up "bystander effect".
> >
> > > 3. Bisection is useful, but not important in most cases.
> >
> > No.
> >
> > Exactly because of the problem syzbot has. It's too scatter-shot.
> > People clearly ignore it, because people feel it's not _their_ issue.
> >
> > The advantage of bisection is that it makes the problem much more
> > specific. Right now, you'll find that many developers ignore syzbot
> > simply because it's not worth their time to chase down whether it's
> > even their problem.
> >
> > See what I'm saying?
> >
> > It's the whole "data vs information" issue. Particularly when cc'ing
> > maintainers, who get hundreds of emails a day, you need to convince
> > them that this email is _relevant_.
>
> I see what you are saying and I agree that bisection results will make
> reports better in some cases. But I mean a more general problem.
>
> Say you reported a bug, and it happened so that you missed that single
> right person in CC because something, whatever, can happen, right?
> With the current process it will be a coin flip if your report will be
> routed to the right person or lost. And it's not that you personally
> care a lot about this particular bug, it just happened that you
> noticed it and wanted to be a good samaritan. So you will not keep
> track of it on a post-it note on your monitor and won't ping later. But
> the bug can be bad and either cause security problems later, or reach
> release and break things in the field and then require 1000x more work
> to port the fix to all downstream forks.
>
> Or, we heavily rely on end users for testing. End users are not kernel
> developers and can't be generally expected to do pre-triage and proper
> routing. Losing these valuable reports is bad because only a small
> fraction of users report anything to projects and this can also affect
> user trust: if you see that your reports are not acted on, you don't
> report next time.
>
> Even if we take syzbot, it won't be able to bisect all the time for
> multiple reasons:
> - some bugs don't have reproducers (but still very real and sometimes
> manageable to fix)
> - kernel is build/boot broken sometimes for prolonged periods
> - some old bugs are bisected to introduction of the debugging tool
> that detects the bug
> - some crashes can be too flaky for reliable bisection
> - some reproducers won't work on older kernels, yet the bug is there
> - ...
> So it will be nice to have bisection results when they are
> available, but it does not feel like it should be the only guarantee
> of a bug report not being lost.
>
> Moreover, you can see in the examples I referenced above that they
> were delivered to the right people, but then still lost because there
> is nothing in the kernel development process that would prevent losses.
>
> Moreover, relying on a small set of private emails generally creates
> problems wrt bus-factor and vacations. It would be useful if anybody
> could see what are the open bugs for rdma_cm subsystem at any point in
> time.

This is quite indicative:

Serious issues affecting all filesystems:
Kernel quality control, or the lack thereof
https://lwn.net/Articles/774114/

Comment on ycombinator:
https://news.ycombinator.com/item?id=18844612

I've filed bugs for some of the mentioned copy_file_range() issues
more than two years ago:
- https://bugzilla.kernel.org/show_bug.cgi?id=135461
- https://bugzilla.kernel.org/show_bug.cgi?id=135451
No response...