Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings

Joanne Koong <joannelkoong@xxxxxxxxx> · Tue, 14 Jan 2025 10:58:04 -0800

On Tue, Jan 14, 2025 at 7:44 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Tue, 2025-01-14 at 09:38 +0100, Miklos Szeredi wrote:
> > On Mon, 13 Jan 2025 at 22:44, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > > What if we were to allow the kernel to kill off an unprivileged FUSE
> > > server that was "misbehaving" [1], clean any dirty pagecache pages that
> > > it has, and set writeback errors on the corresponding FUSE inodes [2]?
> > > We'd still need a rather long timeout (on the order of at least a
> > > minute or so, by default).
> >
> > How would this be different from Joanne's current request timeout patch?
> >
>
> When the timeout pops with Joanne's set, the pages still remain dirty
> (IIUC). The idea here would be that after a call times out and we've
> decided the server is "misbehaving", we'd want to clean the pages and
> mark the inode with a writeback error. That frees up the page to be
> migrated, but a later msync or fsync should return an error. This is
> the standard behavior for writeback errors on filesystems.

I think the pages already get cleaned and the inode marked with an
error in the case of a timeout. The timeout calls into the abort path,
so the abort path should already be doing this. When the connection is
aborted, fuse_request_end() will get invoked, which will call the
req->args->end() callback which for writebacks will be
fuse_writepage_end(). In fuse_writepage_end(), the inode->i_mapping
gets set to the error code and the writeback state will be cleared on
the folio as well (in fuse_writepage_finish()).

>
> > I think it makes sense, but it *has* to be opt in, for the same reason
> > that NFS soft timeout is opt in, so it can't really solve the page
> > migration issue generally.
> >
>
> Does it really need to be though? We're talking unprivileged mounts
> here. Imposing a hard timeout on reads or writes as a mechanism to
> limit resource consumption by an unprivileged user seems like a
> reasonable thing to do. Writeback errors suck, but what other recourse
> do we have in this situation?
>
> We could also consider only enforcing this when memory gets low, or a
> migration has failed.
>

I think there's a case to be made here that this "resource checking"
of unprivileged mounts should be behavior that already exists (eg
automatically enforcing timeouts instead of only by opt-in). The only
issue with this I see is that it might potentially break
backwards-compatibility, but I think it could be argued that
protecting memory resources outweighs that. Though the timeout would
have to be somewhat large, and I don't know if that would be
acceptable for migration.

Thanks,
Joanne

> > Also page reading has exactly the same issues, so fixing writeback is
> > not enough.
> >
>
> Reads are synchronous, so we could just return an error directly on
> those.
>
> > Maybe an explicit callback from the migration code to the filesystem
> > would work. I.e. move the complexity of dealing with migration for
> > problematic filesystems (netfs/fuse) to the filesystem itself.  I'm
> > not sure how this would actually look, as I'm unfamiliar with the
> > details of page migration, but I guess it shouldn't be too difficult
> > to implement for fuse at least.
> >
>
> We already have a ->migrate_folio operation. Maybe we could consider
> pushing down the PG_writeback check into the ->migrate_folio ops? As an
> initial step, we could just make them all return -EBUSY, and then allow
> some (like FUSE) to handle the situation properly.
> --
> Jeff Layton <jlayton@xxxxxxxxxx>