> #syz set subsystems: mm

Your commands are accepted, but please keep syzkaller-bugs@xxxxxxxxxxxxxxxx mailing list in CC next time. It serves as a history of what happened with each bug report. Thank you.

>
> On Wed, Jun 08, 2022 at 04:36:20AM -0700, syzbot wrote:
>> syzbot has found a reproducer for the following issue on:
>>
>> HEAD commit: cf67838c4422 selftests net: fix bpf build error
>> git tree: net
>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=123c2173f00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=fc5a30a131480a80
>> dashboard link: https://syzkaller.appspot.com/bug?extid=ecab51a4a5b9f26eeaa1
>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1342d5abf00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11ecafebf00000
>
> The root cause of this failure is a fundamental bug / design flaw in
> get_user_pages and related functions, which file system developers
> have been complaining about for literally **years**. See the recent
> discussion at [1] and going back earlier to 2018[2][3] and 2019[4].
>
> [1] https://lore.kernel.org/all/6b73e692c2929dc4613af711bdf92e2ec1956a66.1682638385.git.lstoakes@xxxxxxxxx/
> [2] https://lwn.net/Articles/753027/
> [3] https://lwn.net/Articles/774411/
> [4] https://lwn.net/Articles/784574/
>
> I'm going to reassign this to the mm subsystem, since there's not much
> we can do on the file system end. The WARNING is considered a good
> thing because users can see silent data corruption/loss if they use
> process_vm_writev() or RDMA to write to memory backed by a file. And
> while most users at large hyperscale scientific compute farms probably
> won't be paying attention to the system logs, at least we've done
> something to warn them.
>
> Fortunately, data corruption is rare (but when it happens it could
> really screw with your results!), but if they are doing some large
> scale simulation to evaluate the safety of nuclear weapons (for
> example), it would be nice if they got at least some hint.
>
> There is a potential solution discussed at [1], but there is push back
> since it could break users by disallowing the thing that might cause
> data corruption. While breaking user applications is bad, turning a
> possible silent data corruption into a very visible, hard failure is
> arguably a good thing....
>
> - Ted