On Mon, Mar 2, 2020 at 2:24 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > > > On Sun, Mar 1, 2020 at 9:13 PM syzbot
> > > > > <syzbot+66a9752fa927f745385e@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following crash on:
> > > > > >
> > > > > > HEAD commit:    f8788d86 Linux 5.6-rc3
> > > > > > git tree:       upstream
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > > > > > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > > > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> > > > > >
> > > > >
> > > > > Dmitry,
> > > > >
> > > > > There is something strange about the C repro.
> > > > > It passes an invalid address for the first arg of mount syscall:
> > > > >
> > > > > syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
> > > > > 0x20000100ul);
> > > > >
> > > > > With this address mount syscall returns -EFAULT on my system.
> > > > > I fixed this manually, but repro did not trigger the reported bug on my system.
> > > >
> > > > Hi Amir,
> > > >
> > > > This is not strange in the context of a fuzzer; its goal is to pass
> > > > random data. Generally if it says 0x400000ul, that's what it is, don't
> > > > fix it, or you are running a different program that may not reproduce
> > > > the bug. If syzbot attaches a reproducer, the bug was triggered by
> > > > precisely this program.
> > > >
> > >
> > > What's strange is that a bug in overlay code cannot be triggered if overlay
> > > isn't mounted and as it is the repro couldn't mount overlayfs at all, at
> > > least with my kernel config.
> >
> > Can it depend on kernel config? The bug was triggered by the program
> > provided somehow.
>
> I am not sure. I do not have CONFIG_HARDENED_USERCOPY set.
>
> > Separate question: why is it failing? Isn't src unused for overlayfs?
> > Where/how does vfs code look at src?
> >
>
> SYSCALL_DEFINE5(mount, ...
>     copy_mount_string(dev_name)
>         strndup_user()
>             memdup_user()
>                 copy_from_user()
>
> Not in overlayfs code.
> Actually, the source (dev) is not used by overlayfs but is visible at
> /proc/mounts.

Oh, I see, this is another instance of "fuzzer fun".

In the descriptions we define the src argument as const 0. And const 0
is fine and is accepted by copy_mount_string (it has a check for NULL).
Generally the fuzzer does not try to change values specified as const,
but sometimes it does. So I guess it so happened that address 0x400000ul
is mapped onto the executable and contained something that resembles a
null-terminated string, so copy_mount_string did not fail (but otherwise
that string does not matter much for overlayfs). But in your binary
0x400000ul did not contain an addressable null-terminated string, and
mount failed.

Additionally, we don't attempt to change a const value back to its
default value during the crash minimization/simplification process:
https://github.com/google/syzkaller/blob/4a4e0509de520c7139ca2b5606712cbadc550db2/prog/minimization.go#L202-L206
because it was deemed too expensive (for each attempt we need a freshly
booted and clean machine) and not important enough (it's just a single
arg value and does not increase the "systematic complexity" of the repro).

All of this has combined into the effect we see here...

I am not sure what the action item is here...

FWIW fuzzer-found bugs will always be more expensive to debug and deal
with, for a very long tail of various reasons. Unit tests don't have this
problem. If only we had comprehensive test coverage for the kernel, we
would not need to deal with so many fuzzer-found bugs... ;)
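
FWIW, one way to check the 0x400000ul guess locally is to ask the kernel
whether it can copy a string from that address in a given binary. Below
is a minimal sketch, not something from syzkaller or the kernel: the
helper name user_string_is_addressable is hypothetical, and it uses
access(2) as a stand-in because it also copies a string argument from
user space and fails with EFAULT on a bad pointer. It does not go through
copy_mount_string itself, so treat the result as a hint only.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not syzkaller or
 * kernel code): report whether the kernel can copy a string from `addr`
 * in the current process.
 *
 * access(2) copies its path argument from user memory before doing
 * anything else, so a bad pointer makes it fail with EFAULT. Any other
 * outcome (success, ENOENT, ENOTDIR, ...) means the copy itself worked.
 *
 * Caveat: mount(2) copies its source string through a different kernel
 * path, so this is only an approximation. For example, ENAMETOOLONG
 * (no NUL found within PATH_MAX bytes) is reported as addressable here.
 */
int user_string_is_addressable(uintptr_t addr)
{
    errno = 0;
    if (access((const char *)addr, F_OK) == 0)
        return 1;               /* string copied and path exists */
    return errno != EFAULT;     /* 0 only when the pointer itself was bad */
}
```

Calling user_string_is_addressable(0x400000ul) from the reproducer as
built by syzbot and from a locally built copy should show whether the two
binaries differ at that address. On x86-64 a non-PIE executable is
typically linked at 0x400000, while a PIE one is loaded elsewhere, which
might be the difference here.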