Hi Dave,

> KMSAN has been used for quite a long time with syzbot, however,
> and it's supposed to find these problems, too. Yet it's only been
> finding this for 6 months?

As Alex already mentioned, there were big fs fuzzing improvements in
2022, and that's exactly when we started seeing "KMSAN: uninit-value
in __crc32c_le_base" (I've just checked crash history). Before that
moment the code was likely just not exercised on syzbot.

On Fri, Dec 15, 2023 at 10:59 PM 'Dave Chinner' via syzkaller-bugs
<syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Dec 15, 2023 at 03:41:49PM +0100, Alexander Potapenko wrote:
> >
> > You are right, syzbot used to mount XFS way before 2022.
> > On the other hand, last fall there were some major changes to the way
> > syz_mount_image() works, so I am attributing the newly detected bugs
> > to those changes.
>
> Oh, so that's when syzbot first turned on XFS V5 format testing?
>
> Or was that done in April, when this issue was first reported?
>
> > Unfortunately we don't have much insight into reasons behind syzkaller
> > being able to trigger one bug or another: once a bug is found for the
> > first time, the likelihood to trigger it again increases, but finding
> > it initially might be tricky.
> >
> > I don't understand much how trivial is the repro at
> > https://gist.github.com/xrivendell7/c7bb6ddde87a892818ed1ce206a429c4,
>
> I just looked at it - all it does is create a new file. It's
> effectively "mount; touch", which is exactly what I said earlier
> in the thread should reproduce this issue every single time.
>
> > but overall we are not drilling deep enough into XFS.
> > https://storage.googleapis.com/syzbot-assets/8547e3dd1cca/ci-upstream-kmsan-gce-c7402612.html
> > (ouch, 230Mb!) shows very limited coverage.
>
> *sigh*
>
> Did you think to look at the coverage results to check why the
> numbers for XFS, ext4 and btrfs are all at 1%?

Hmmm, thanks for pointing it out!

Our ci-upstream-kmsan-gce instance is configured in such a way that
the fuzzer program is quite restricted in what it can do. Apparently,
it also lacks the capabilities needed to mount filesystems, so we get
almost no coverage in fs/*/**. I'll check whether the lack of
permissions to mount() was intended.

On the other hand, the ci-upstream-kmsan-gce-386 instance does not
have such restrictions at all and we do see fs/ coverage there:
https://storage.googleapis.com/syzbot-assets/609dc759f08b/ci-upstream-kmsan-gce-386-0e389834.html

It's still quite low for fs/xfs, which is explainable -- we almost
immediately hit "KMSAN: uninit-value in __crc32c_le_base". For the
same reason, coverage elsewhere is also somewhat lower than it could
be -- we spend too much time restarting VMs after crashes. Once the
fix patch reaches the fuzzed kernel tree, ci-upstream-kmsan-gce-386
should be back to normal.

If we want to see how deep syzbot can go into the fs/ code in general,
it's better to look at the KASAN instance coverage:
https://storage.googleapis.com/syzbot-assets/12b7d6ca74e6/ci-upstream-kasan-gce-root-0e389834.html (*)

Here, e.g., fs/ext4 is already at 63% and fs/xfs is at 16%.

(*) Be careful, the file is very big.

--
Aleksandr

> Why didn't the low
> number make you dig a bit deeper to see if the number was real or
> whether there was a test execution problem during measurement?
>
> I just spent a minute doing exactly that, and the answer is
> pretty obvious. Both ext4 and XFS had mount attempts
> rejected at mount option parsing, and btrfs rejected a device scan
> ioctl. That's it. Nothing else was exercised in those three
> filesystems.
>
> Put simply: the filesystems *weren't tested during coverage
> measurement*.
>
> If you are going to do coverage testing, please measure coverage
> over *thousands* of different tests performed on a single filesystem
> type.
> It needs to be thousands, because syzbot tests are so shallow
> and narrow that actually covering any significant amount of
> filesystem code is quite difficult....
>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
>
> --