On Mon, Aug 14, 2023 at 07:38:52AM -0400, Theodore Ts'o wrote: > On Sat, Aug 12, 2023 at 09:30:22PM -0700, Eric Biggers wrote: > > Well, one thing that the kernel community can do to make things better is > > identify when a large number of bug reports are caused by a single issue > > ("userspace can write to mounted block devices"), and do something about that > > underlying issue (https://lore.kernel.org/r/20230704122727.17096-1-jack@xxxxxxx) > > instead of trying to "fix" large numbers of individual "bugs". We can have 1000 > > bugs or 1 bug, it is actually our choice in this case. > > That's assuming the syzbot folks are willing to enable the config in > Jan's patch. The syzbot folks refused to enable it, unless the config > was gated on CONFIG_INSECURE, which I object to, because that's > presuming a threat model that we have not all agreed is valid. > > Or rather, if it *is* valid some community members (or cough, cough, > **companies**) need to step up and supply patches. As the saying > goes, "patches gratefully accepted". It is *not* the maintainer's > responsibility to grant every single person whining about a feature > request, or even a bug fix. They did disable CONFIG_XFS_SUPPORT_V4. Yes, there was some pushback, but they are very understandably (and correctly) concerned about ending up in a situation where buggy code is disabled for syzkaller but enabled for everyone else. They eventually did accept the proposal to disable CONFIG_XFS_SUPPORT_V4, for reasons including the fact that there is a concrete deprecation plan. (FWIW, the XFS maintainers had also pushed back strongly when I suggested adding a kconfig option for XFS v4 support back in 2018. Sometimes these things just take time.) Anyway, syzkaller is just the messenger that is reminding us of a problem. The actual problem here is that "make filesystems robust against concurrent changes to block device's page cache" is effectively unsolvable. *Every* memory access to the cache would need to use READ_ONCE/WRITE_ONCE, and *every* re-read of every field would need to be fully re-validated. PageChecked would need to go away for metadata, as it would be invalid to cache the checked status at all. There's basically zero chance of getting this correct; filesystems struggle to validate even the metadata read from disk correctly, so how are they going to successfully continually revalidate all cached metadata in memory? I don't understand why we would waste time trying to do that instead of focusing on an actual solution. Sure, probably The Linux Filesystem Maintainers(TM) don't have time to help with migrating legacy use cases of writing to mounted block devices, but that just means that someone needs to step up to do it. It doesn't mean that they need to instead waste time on pointless "fixes". Keep in mind, the syzkaller team isn't asking for these pointless "fixes" either. They'd very much prefer 1 fix to 1000 fixes. I think some confusion might be arising from the very different types of problems that syzkaller finds. Sometimes 1 syzkaller report == 1 bug == 1 high-priority "must fix" bug == 1 vulnerability == 1 fix needed. But in general syzkaller is just letting kernel developers know about a problem, and it is up to them to decide what to do about it. In this case there is one underlying issue that needs to be fixed, and the individual syzkaller reports that result from that issue are not important. - Eric