On Fri, Sep 08, 2023 at 01:55:11AM -0700, Christoph Hellwig wrote:
> On Wed, Sep 06, 2023 at 09:06:21AM +1000, Dave Chinner wrote:
> > I think this completely misses the point of contention of the larger
> > syzbot vs filesystem discussion: the assertion that "testing via
> > syzbot means the subsystem is secure" where "secure" means "can be
> > used safely for operations that involve trust model violations".
> >
> > Fundamentally, syzbot does nothing to actually validate that the
> > filesystem is "secure". Fuzzing can only find existing bugs by
> > simulating an attacker, but it does nothing to address the
> > underlying issues that allow that attack channel to exist.
>
> I don't think anyone makes that assertion. Instead the assumption
> is that something that is handling untrusted input should be able
> to survive fuzzing by syzbot, and that's an assumption I agree with.
> That doesn't imply that anything surviving syzbot is secure, but if
> it doesn't survive syzbot it surely can't deal with untrusted input.

Sure, but as an experienced filesystem developer who, 15 years ago,
architected and implemented a metadata verification mechanism that
effectively defeats *random bit mutation metadata fuzzing*, I am
making sure that everyone is aware that "syzbot doesn't find problems"
is not the same thing as "filesystem is safe to handle untrusted
input".

Syzbot being unable to find problems is a good start, but I know
*many* ways to screw over the XFS kernel implementation by mutating
the metadata in nasty ways that we *can't actually protect against* at
runtime, and that syzbot is *never* going to stumble across by a
random walk through all the possible bit mutations that can occur in a
filesystem's metadata.

I stress this again: syzbot not finding problems does not, in any way,
imply that a filesystem implementation is safe to parse untrusted
filesystem images in a ring 0 context.
Anyone who says "syzbot doesn't find problems, so it's good to go with
untrusted input" is completely ignoring the long standing and well
known practical limitations of the fuzzing techniques used by tools
like syzbot...

> > > unmaintained. If we want to move the kernel forward by finishing
> > > API transitions (new mount API, buffer_head removal for the I/O
> > > path, ->writepage removal, etc) these file systems need to change
> > > as well and need some kind of testing. The easiest way forward
> > > would be to remove everything that is not fully maintained, but
> > > that would remove a lot of useful features.
> >
> > Linus has explicitly NACKed that approach.
> >
> > https://lore.kernel.org/linux-fsdevel/CAHk-=wg7DSNsHY6tWc=WLeqDBYtXges_12fFk1c+-No+fZ0xYQ@xxxxxxxxxxxxxx/
>
> .. and that is why I'm bringing this up in a place where we can have
> a proper procedural discussion instead of snarky remarks. This is
> a fundamental problem we'll need to sort out.

I agree, which is why I'm trying to make sure that everyone has the
same understanding of the situation.

Allowing filesystems to parse untrusted data in a ring 0 context comes
down to which filesystem developers actually trust their code and
on-disk format verification enough to allow it to be exposed willingly
to untrusted input.

Make no mistake about it: I'm not willing to take that risk with XFS.
I'm not willing to take responsibility for deciding that we should
expose XFS to untrusted input - I *know* that it isn't safe, and it
would be gross negligence for me to present the code that I help
maintain and develop any other way.

> > Which is a problem, because historically we've taken code into
> > the kernel without requiring a maintainer, or the people who
> > maintained the code have moved on, yet we don't have a policy for
> > removing code that is slowly bit-rotting to uselessness.
> ...
> and we keep merging crap that goes against all established normal
> requirements when people think it's new and shiny and cool :(

Well, yes, but that's a separate (though somewhat related) discussion.

The observation I'd make from your comment is that the Linux project,
as a whole, has no clearly defined feature life-cycle process. For the
purpose of this discussion, we're concerned about the end-of-life
process for removing ancient, obsolete and/or broken code in a sane,
timely manner - a process we are completely lacking.

A project that has been going for 30 years, and is likely to be going
for another 30 years, needs a well defined EOL process. Not just for
filesystems, but for everything: syscalls, drivers, platforms, sysfs
interfaces, etc.

The current process of "send an email, and if anyone shouts, don't
remove it" means that as long as there's a single user left, we can't
get rid of the junk that is causing us problems right now. That's a
terrible policy: as long as a single person has something on their
shelf that they want to keep working, we're supposed to keep it
working. In the cases where the developer time needed to keep the
feature working outweighs the number of users, the cost/benefit ratio
is so far on the "cost" side it is not funny. And when it comes to
filesystems, the risk/benefit analysis is pegged as hard as it can be
against the "risk" side.

IOWs, there's a wider scope here than just "how do we manage all these
obsolete, buggy, legacy filesystems?". It points to the fact that the
Linux project itself doesn't really know how to remove old code and
features that have become a burden to ongoing development....

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx