On Tue, Nov 29, 2022 at 04:04:35AM +0000, Al Viro wrote:
> On Mon, Nov 28, 2022 at 02:57:49PM -0800, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
>
> [snip]
>
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17219fbb880000
>
> "syz_mount_image$ntfs3(" followed by arseloads of garbage. And the thing
> conspiciously missing? Why, any ntfs3 maintainers in Cc... Or lists,
> for that matter...
>
> > generic_file_read_iter+0x3d4/0x540 mm/filemap.c:2804
> > do_iter_read+0x6e3/0xc10 fs/read_write.c:796
> > vfs_readv fs/read_write.c:916 [inline]
> > do_preadv+0x1f4/0x330 fs/read_write.c:1008
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> At a guess - something's screwed in ntfs3 ->direct_IO() (return value, most
> likely). And something's screwed in syzbot. If you are fuzzing some
> filesystem, YOU REALLY OUGHT TO CC THE MAINTAINERS OF THAT FILESYSTEM.
> Even if nothing in the stack trace happens to be in that fs.

The scheme which syzbot appears to use involves looking at the symbol
in EIP from the stack trace to determine who to CC. This... mostly
works, but occasionally results in hilarity.

For example, there was the time when the fuzzing program fed some
other file system (f2fs, as I recall) several hundred invalid file
systems, and then for some reason it fed ext4 an invalid file system,
and ext4 tripped on an invalid pointer dereference. Of course, just
feeding ext4 the invalid file system had no issues, and a human being
might have intuited that maybe the several hundred invalid f2fs file
systems triggered some kind of memory corruption which ext4 then
tripped across --- but since the EIP was in the ext4 file system, the
ext4 maintainers got cc'ed, and if you look in the dashboard, it just
shows an ext4 symbol, so it's unlikely the f2fs developers would ever
have discovered it on their own.
(I did cc it to them, but they weren't able to get to it immediately,
and it'll be hard to find it again, since we don't have a bug tracking
system and there's no way to set some kind of "subsystem really at
fault" state in the Syzkaller dashboard.)

> Folks, it's that simple - "our bot needs to remember that fuzzing $FS
> automatically puts maintainers of $FS into the set of people we need to Cc"
> vs. "maintainers of each filesystem need to dig into every syzbot posting
> on fsdevel (and follow links, no less) to check if their fs might be
> involved". If you can't be bothered to take care of the former, why
> would you expect $BIGNUM people to bother with the latter, again and
> again and again?

The strength and weakness of syzkaller is that it will combine fuzzing
with, say, setting up and tearing down a gazillion wireguard tunnels,
or some other random set of system calls. Sometimes that finds a real
bug. Other times, for some strange reason, the syzkaller minimizer
can't figure out that setting up and tearing down the network
connections is pointless noise, except that on the specific
hardware/VM used by syzkaller, it makes it easier to trigger a
timing-related bug --- but $DEITY help you if you try to reproduce on
anything other than the specific VM used by the syzkaller bug.

And then, of course, there are cases where the reproducer is only
doing one thing, such as, say, messing with ntfs3, and syzbot *should*
be able to figure out a better set of maintainers to notify --- but
then, given that the syzbot subject line/summary is something generic,
such as iov_iter_XXX, and there's no way to set an affected-subsystem
state in the dashboard, good luck having anyone else find it in the
syzkaller dashboard later on.

> Fix your bot, already. It's not the first time this had been brought
> to your attention and the problem is still there.

I've brought this to the Syzkaller team's attention multiple times.
Unfortunately, it's not exactly a trivial problem to solve, and other
things have been considered higher priority.

(Hint to the Syzkaller team: if you can prioritize and share a roadmap
with upstream developers vis-a-vis upstream concerns, it might make
some upstream developers a bit more receptive to patches designed to
make it easier for syzkaller to find EVEN MORE FILESYSTEM FUZZING
BUGS, such as [1]. Otherwise, it is perhaps understandable why some
might consider this more of a threat than a benefit... Note: [1]
doesn't make a difference for ext4 either way, since metadata
checksums are a file system feature that can be enabled or disabled at
mkfs time; this patch series is about finding more file system bugs
for file systems which don't make disabling checksums an option, such
as XFS.)

[1] https://lore.kernel.org/all/20221014084837.1787196-1-hrkanabar@xxxxxxxxx/
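
For concreteness, here is a rough sketch of the rule quoted above
("fuzzing $FS automatically puts maintainers of $FS into the set of
people we need to Cc"). It is not how syzbot is implemented. The
maintainer table, the addresses, and the function names are all
hypothetical, and the only thing taken from the thread is the
"syz_mount_image$<fs>(" call form seen in the reproducer.

```python
import re

# Hypothetical maintainer table; the addresses are placeholders and are
# NOT taken from the kernel MAINTAINERS file.
FS_MAINTAINERS = {
    "ntfs3": {"ntfs3-maintainers@example.org"},
    "ext4": {"ext4-maintainers@example.org"},
    "f2fs": {"f2fs-maintainers@example.org"},
}

# Matches calls of the form syz_mount_image$<fs>( in a syz reproducer.
MOUNT_RE = re.compile(r"syz_mount_image\$(\w+)\(")


def filesystems_in_repro(repro_text):
    """Return the set of filesystem names the reproducer mounts."""
    return set(MOUNT_RE.findall(repro_text))


def cc_list(repro_text, trace_based_cc):
    """Add maintainers of every fuzzed filesystem to the trace-derived Cc set."""
    cc = set(trace_based_cc)
    for fs in filesystems_in_repro(repro_text):
        # Filesystems missing from the table contribute nothing.
        cc |= FS_MAINTAINERS.get(fs, set())
    return cc
```

With something along these lines, a reproducer that mounts an ntfs3
image would pull in the ntfs3 entry even when every symbol in the
stack trace lives in mm/ or fs/read_write.c. It does nothing for the
harder case described above, where an earlier filesystem corrupts
memory and a different one trips over it.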