On Tue, 2025-01-14 at 09:38 +0100, Miklos Szeredi wrote: > On Mon, 13 Jan 2025 at 22:44, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > What if we were to allow the kernel to kill off an unprivileged FUSE > > server that was "misbehaving" [1], clean any dirty pagecache pages that > > it has, and set writeback errors on the corresponding FUSE inodes [2]? > > We'd still need a rather long timeout (on the order of at least a > > minute or so, by default). > > How would this be different from Joanne's current request timeout patch? > When the timeout pops with Joanne's set, the pages still remain dirty (IIUC). The idea here would be that after a call times out and we've decided the server is "misbehaving", we'd want to clean the pages and mark the inode with a writeback error. That frees up the page to be migrated, but a later msync or fsync should return an error. This is the standard behavior for writeback errors on filesystems. > I think it makes sense, but it *has* to be opt in, for the same reason > that NFS soft timeout is opt in, so it can't really solve the page > migration issue generally. > Does it really need to be though? We're talking unprivileged mounts here. Imposing a hard timeout on reads or writes as a mechanism to limit resource consumption by an unprivileged user seems like a reasonable thing to do. Writeback errors suck, but what other recourse do we have in this situation? We could also consider only enforcing this when memory gets low, or a migration has failed. > Also page reading has exactly the same issues, so fixing writeback is > not enough. > Reads are synchronous, so we could just return an error directly on those. > Maybe an explicit callback from the migration code to the filesystem > would work. I.e. move the complexity of dealing with migration for > problematic filesystems (netfs/fuse) to the filesystem itself. I'm > not sure how this would actually look, as I'm unfamiliar with the > details of page migration, but I guess it shouldn't be too difficult > to implement for fuse at least. > We already have a ->migrate_folio operation. Maybe we could consider pushing down the PG_writeback check into the ->migrate_folio ops? As an initial step, we could just make them all return -EBUSY, and then allow some (like FUSE) to handle the situation properly. -- Jeff Layton <jlayton@xxxxxxxxxx>