On Tue, 2025-01-14 at 10:40 +0100, Miklos Szeredi wrote:
> On Tue, 14 Jan 2025 at 09:38, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> > Maybe an explicit callback from the migration code to the filesystem
> > would work. I.e. move the complexity of dealing with migration for
> > problematic filesystems (netfs/fuse) to the filesystem itself. I'm
> > not sure how this would actually look, as I'm unfamiliar with the
> > details of page migration, but I guess it shouldn't be too difficult
> > to implement for fuse at least.
>
> Thinking a bit...
>
> 1) reading pages
>
> Pages are allocated (PG_locked set, PG_uptodate cleared) and passed to
> ->readpages(), which may make the pages uptodate asynchronously. If a
> page is unlocked but not set uptodate, then caller is supposed to
> retry the reading, at least that's how I interpret
> filemap_get_pages(). This means that it's fine to migrate the page
> before it's actually filled with data, since the caller will retry.
>
> It also means that it would be sufficient to allocate the page itself
> just before filling it in, if there was a mechanism to keep track of
> these "not yet filled" pages. But that probably off topic.
>

Sounds plausible.

> 2) writing pages
>
> When the page isn't actually being copied, the writeback could be
> cancelled and the page redirtied. At which point it's fine to migrate
> it. The problem is with pages that are spliced from /dev/fuse and
> control over when it's being accessed is lost. Note: this is not
> actually done right now on cached pages, since writeback always copies
> to temp pages. So we can continue to do that when doing a splice and
> not risk any performance regressions.
>

Can we just cancel and redirty the page like that when doing a
WB_SYNC_ALL flush? I think we'd need to ensure that it gets a new
writeback attempt as soon as the migration is done if that's in
progress, no?
-- 
Jeff Layton <jlayton@xxxxxxxxxx>