On Thu, Apr 1, 2021 at 2:19 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > On Wed, Mar 31, 2021 at 07:29:22AM +0300, Dan Carpenter wrote: > > On Tue, Mar 30, 2021 at 07:04:30PM +0200, Paolo Bonzini wrote: > > > On 30/03/21 17:26, syzbot wrote: > > > > Hello, > > > > > > > > syzbot found the following issue on: > > > > > > > > HEAD commit: 93129492 Add linux-next specific files for 20210326 > > > > git tree: linux-next > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=169ab21ad00000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=6f2f73285ea94c45 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=015dd7cdbbbc2c180c65 > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119b8d06d00000 > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=112e978ad00000 > > > > > > > > The issue was bisected to: > > > > > > > > commit d40b9fdee6dc819d8fc35f70c345cbe0394cde4c > > > > Author: Daniel Vetter <daniel.vetter@xxxxxxxx> > > > > Date: Tue Mar 16 15:33:01 2021 +0000 > > > > > > > > mm: Add unsafe_follow_pfn > > > > > > > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=122d2016d00000 > > > > final oops: https://syzkaller.appspot.com/x/report.txt?x=112d2016d00000 > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=162d2016d00000 > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > Reported-by: syzbot+015dd7cdbbbc2c180c65@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > Fixes: d40b9fdee6dc ("mm: Add unsafe_follow_pfn") > > > > > > This is basically intentional because get_vaddr_frames is broken, isn't it? > > > I think it needs to be ignored in syzkaller. > > > > What? > > > > The bisect is wrong (because it's blaming the commit which added the > > warning instead of the commit which added the buggy caller) but the > > warning is correct. > > > > Plus users are going to be seeing this as well. According to the commit > > message for 69bacee7f9ad ("mm: Add unsafe_follow_pfn") "Unfortunately > > there's some users where this is not fixable (like v4l userptr of iomem > > mappings)". It sort of seems crazy to dump this giant splat and then > > tell users to ignore it forever because it can't be fixed... 0_0 > > I think the discussion conclusion was that this interface should not > be used by userspace anymore, it is obsolete by some new interface? > > It should be protected by some kconfig and the kconfig should be > turned off for syzkaller runs. If this is not a kernel bug, then it must not use WARN_ON[_ONCE]. It makes the kernel untestable for both automated systems and humans: https://lwn.net/Articles/769365/ <quote> Greg Kroah-Hartman raised the problem of core kernel API code that will use WARN_ON_ONCE() to complain about bad usage; that will not generate the desired result if WARN_ON_ONCE() is configured to crash the machine. He was told that the code should just call pr_warn() instead, and that the called function should return an error in such situations. It was generally agreed that any WARN_ON() or WARN_ON_ONCE() calls that can be triggered from user space need to be fixed. </quote> https://lore.kernel.org/netdev/20210413085522.2caee809@xxxxxxxxxxxxxxxxxx/ From: Steven Rostedt <quote> I agree. WARN_ON(_ONCE) should be reserved for anomalies that should not happen ever. Anything that the user could trigger, should not trigger a WARN_ON. A WARN_ON is perfectly fine for detecting an accounting error inside the kernel. I have them scattered all over my code, but they should never be hit, even if something in user space tries to hit it. (with an exception of an interface I want to deprecate, where I want to know if it's still being used ;-) Of course, that wouldn't help bots testing the code. And I haven't done that in years) Any anomaly that can be triggered by user space doing something it should not be doing really needs a pr_warn(). </quote> And if it's a kernel bug reachable from user-space, then I think this code should be removed entirely, not just on all testing systems. Or otherwise if we are not removing it for some reason, then it needs to be fixed.