On 14.01.25 16:49, Jeff Layton wrote:
On Tue, 2025-01-14 at 10:40 +0100, Miklos Szeredi wrote:
On Tue, 14 Jan 2025 at 09:38, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
Maybe an explicit callback from the migration code to the filesystem
would work. I.e. move the complexity of dealing with migration for
problematic filesystems (netfs/fuse) to the filesystem itself. I'm
not sure how this would actually look, as I'm unfamiliar with the
details of page migration, but I guess it shouldn't be too difficult
to implement for fuse at least.
Thinking a bit...
1) reading pages
Pages are allocated (PG_locked set, PG_uptodate cleared) and passed to
->readpages(), which may make the pages uptodate asynchronously. If a
page is unlocked but not set uptodate, then the caller is supposed to
retry the reading, at least that's how I interpret
filemap_get_pages(). This means that it's fine to migrate the page
before it's actually filled with data, since the caller will retry.
It also means that it would be sufficient to allocate the page itself
just before filling it in, if there was a mechanism to keep track of
these "not yet filled" pages. But that's probably off topic.
Sounds plausible.
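To make the retry argument in 1) concrete, here is a toy userspace
model in C. It is only a sketch under assumptions: struct toy_page and
every toy_* function are hypothetical names invented for illustration,
not kernel APIs, and the "abort for migration" step merely stands in
for whatever callback the filesystem might get.

```c
#include <stdbool.h>

/* Toy stand-in for the two page flags discussed above (hypothetical,
 * not the kernel's struct page or struct folio). */
struct toy_page {
    bool locked;    /* models PG_locked */
    bool uptodate;  /* models PG_uptodate */
    int  fills;     /* number of read attempts started on this page */
};

/* A read starts with the page locked and not uptodate. */
static void toy_start_read(struct toy_page *p)
{
    p->locked = true;
    p->uptodate = false;
    p->fills++;
}

/* A read ends by unlocking; uptodate is set only on success. */
static void toy_end_read(struct toy_page *p, bool success)
{
    if (success)
        p->uptodate = true;
    p->locked = false;
}

/* Assumed migration hook: give up on the in-flight read by unlocking
 * the page without filling it, so the page may be moved. */
static void toy_abort_read_for_migration(struct toy_page *p)
{
    toy_end_read(p, false);
}

/* Caller side: an unlocked page that is not uptodate means "retry".
 * interrupted_attempts simulates reads aborted by migration.
 * Returns the number of attempts needed to get an uptodate page. */
static int toy_read_with_retry(struct toy_page *p, int interrupted_attempts)
{
    int attempts = 0;

    for (;;) {
        toy_start_read(p);
        attempts++;
        if (interrupted_attempts > 0) {
            interrupted_attempts--;
            toy_abort_read_for_migration(p);
        } else {
            toy_end_read(p, true);
        }
        if (!p->locked && p->uptodate)
            return attempts;
    }
}
```

In this model a read interrupted twice simply costs the caller two
extra attempts; nothing is lost, because no data had been filled in
yet when the page was released.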
2) writing pages
When the page isn't actually being copied, the writeback could be
cancelled and the page redirtied. At which point it's fine to migrate
it. The problem is with pages that are spliced from /dev/fuse, where
control over when they are accessed is lost. Note: this is not
actually done right now on cached pages, since writeback always copies
to temp pages. So we can continue to do that when doing a splice and
not risk any performance regressions.
Can we just cancel and redirty the page like that when doing a
WB_SYNC_ALL flush? I think we'd need to ensure that it gets a new
writeback attempt as soon as the migration is done if that's in
progress, no?
Yeah, that was one of my initial questions as well: could one
"transparently" (to user space) handle canceling writeback and simply
re-dirty the page.
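
The cancel-and-redirty idea, including the WB_SYNC_ALL concern, can be
sketched as a toy state machine in C. This is an assumption-laden
model, not a proposal for the actual implementation: struct
toy_wb_page, the toy_* functions and the TOY_WB_* constants are
hypothetical, and "being_copied" stands for the window in which the
data is actually being copied out.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two writeback sync modes. */
enum toy_sync_mode { TOY_WB_SYNC_NONE, TOY_WB_SYNC_ALL };

struct toy_wb_page {
    bool dirty;         /* models PG_dirty */
    bool writeback;     /* models PG_writeback */
    bool being_copied;  /* data is currently being copied out */
};

/* Begin writeback of a dirty page. */
static bool toy_start_writeback(struct toy_wb_page *p)
{
    if (!p->dirty)
        return false;
    p->dirty = false;
    p->writeback = true;
    return true;
}

/* Cancel writeback and redirty, but only while no copy is running. */
static bool toy_cancel_writeback(struct toy_wb_page *p)
{
    if (!p->writeback || p->being_copied)
        return false;
    p->writeback = false;
    p->dirty = true;
    return true;
}

/* Migrate a page. Writeback in progress is cancelled first; for a
 * TOY_WB_SYNC_ALL flush it is restarted right after the move so the
 * flush still gets its writeback attempt. */
static bool toy_migrate(struct toy_wb_page *p, enum toy_sync_mode mode)
{
    bool cancelled = false;

    if (p->writeback) {
        if (!toy_cancel_writeback(p))
            return false;   /* data in flight: cannot migrate now */
        cancelled = true;
    }

    /* ... the page contents would be moved here ... */

    if (cancelled && mode == TOY_WB_SYNC_ALL)
        toy_start_writeback(p);
    return true;
}
```

In this model, migration during a TOY_WB_SYNC_NONE pass leaves the
page dirty for a later pass, while a TOY_WB_SYNC_ALL flush sees the
page go straight back under writeback once the move is done.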
--
Cheers,
David / dhildenb