On 14.01.25 21:51, Joanne Koong wrote:
On Tue, Jan 14, 2025 at 2:07 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
On Tue, 14 Jan 2025 at 10:55, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
On 1/14/25 10:40, Miklos Szeredi wrote:
On Tue, 14 Jan 2025 at 09:38, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
Maybe an explicit callback from the migration code to the filesystem
would work. I.e. move the complexity of dealing with migration for
problematic filesystems (netfs/fuse) to the filesystem itself. I'm
not sure how this would actually look, as I'm unfamiliar with the
details of page migration, but I guess it shouldn't be too difficult
to implement for fuse at least.
Thinking a bit...
1) reading pages
Pages are allocated (PG_locked set, PG_uptodate cleared) and passed to
->readpages(), which may make the pages uptodate asynchronously. If a
page is unlocked but not set uptodate, then the caller is supposed to
retry the reading, at least that's how I interpret
filemap_get_pages(). This means that it's fine to migrate the page
before it's actually filled with data, since the caller will retry.
It also means that it would be sufficient to allocate the page itself
just before filling it in, if there was a mechanism to keep track of
these "not yet filled" pages. But that's probably off topic.
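
To make the retry behaviour described above concrete, here is a toy
userspace sketch in C. It is not kernel code: struct toy_page,
struct toy_backend, toy_read_page() and toy_get_page() are made-up names,
and the real logic in filemap_get_pages() is considerably more involved.
The only point shown is that a page which comes back unlocked but not
uptodate simply causes the caller to try again.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a page cache page (hypothetical, not struct page). */
struct toy_page {
	bool locked;
	bool uptodate;
	char data[16];
};

/* Hypothetical backend that fails a configurable number of reads. */
struct toy_backend {
	int failures_left;
	int attempts;
};

/*
 * Models an asynchronous read completing: the page is always unlocked,
 * but it is only marked uptodate if the read actually succeeded.
 */
static void toy_read_page(struct toy_backend *be, struct toy_page *pg)
{
	assert(pg->locked && !pg->uptodate);
	be->attempts++;
	if (be->failures_left > 0) {
		be->failures_left--;
	} else {
		memcpy(pg->data, "filled", 7);
		pg->uptodate = true;
	}
	pg->locked = false;
}

/* Caller side: retry while the page is unlocked but not uptodate. */
static bool toy_get_page(struct toy_backend *be, struct toy_page *pg,
			 int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (pg->uptodate)
			return true;
		pg->locked = true;
		toy_read_page(be, pg);
	}
	return pg->uptodate;
}
```

Because the caller never trusts an unlocked page without checking the
uptodate state, nothing in this model depends on the page staying at the
same address between attempts.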
With /dev/fuse buffer copies should be easy - just allocate the page
on buffer copy, control is in libfuse.
I think the issue is with generic page cache code, which currently
relies on the PG_locked flag on the allocated but not yet filled page.
If the generic code would be able to keep track of "under
construction" ranges without relying on an allocated page, then the
filesystem could allocate the page just before copying the data,
insert the page into the cache, and mark the relevant portion of the file
uptodate.
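
A rough userspace sketch in C of what tracking "under construction"
ranges without an allocated page could look like. Everything here is
hypothetical (struct toy_file, range_begin(), range_is_pending(),
range_fill_page(), the fixed-size table); no such interface exists in the
page cache today. The idea illustrated is only the ordering: record the
range first, allocate the page when the data arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_NPAGES 4
#define MAX_PENDING 8

/* A byte range [start, end) whose data has been requested. */
struct pending_range {
	size_t start, end;
	bool used;
};

struct toy_file {
	struct pending_range pending[MAX_PENDING];
	char *pages[TOY_NPAGES];	/* NULL until data arrives */
};

/* Mark a range as under construction; no page is allocated here. */
static int range_begin(struct toy_file *f, size_t start, size_t end)
{
	assert(start < end);
	for (int i = 0; i < MAX_PENDING; i++) {
		if (!f->pending[i].used) {
			f->pending[i] = (struct pending_range){ start, end, true };
			return i;
		}
	}
	return -1;	/* table full */
}

/* Readers would wait on this instead of on a locked page. */
static bool range_is_pending(const struct toy_file *f, size_t off)
{
	for (int i = 0; i < MAX_PENDING; i++) {
		const struct pending_range *r = &f->pending[i];

		if (r->used && off >= r->start && off < r->end)
			return true;
	}
	return false;
}

/* Data arrived: allocate the page now, copy, insert, clear the range. */
static int range_fill_page(struct toy_file *f, int slot, size_t index,
			   const char *src, size_t len)
{
	char *pg;

	assert(slot >= 0 && slot < MAX_PENDING && f->pending[slot].used);
	assert(index < TOY_NPAGES && len <= TOY_PAGE_SIZE);
	pg = calloc(1, TOY_PAGE_SIZE);
	if (!pg)
		return -1;
	memcpy(pg, src, len);
	f->pages[index] = pg;
	f->pending[slot].used = false;
	return 0;
}
```

In this model there is never a locked, not yet filled page sitting in
the cache, so there is nothing for migration to trip over while the
request is outstanding.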
With splice you really need a page state.
It's not possible to splice a not-uptodate page.
I wrote this before already - what is the advantage of a tmp page copy
over /dev/fuse buffer copy? I.e. I wonder if we need splice at all here.
Splice seems a dead end, but we probably need to continue supporting
it for a while for backward compatibility.
For the splice case, could we do something like this, or is this too invasive?
* in mm, add a flag that marks a page as either being in migration or
temporarily blocking migration
* in splice, when we have to access the page in the pipe buffer, check
if that flag is set and wait for the migration to complete before
proceeding
* in splice, set that flag while it's accessing the page, which will
only temporarily block migration (e.g. for the duration of the memcpy)
I guess this is basically what the page lock is for, but with less
overhead?
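
The three bullets above can be sketched in userspace C with a single
atomic state word per page. This is only a model under assumptions: the
names (struct mig_page, mig_try_enter(), mig_leave(), mig_splice_copy())
and the three states are invented here, no such mm flag exists, and real
code would sleep rather than return false when it loses the race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-page states; not actual page flags. */
enum mig_state { MIG_IDLE, MIG_MIGRATING, MIG_COPYING };

struct mig_page {
	atomic_int state;
	char data[32];
};

static void mig_page_init(struct mig_page *pg, const char *src)
{
	atomic_init(&pg->state, MIG_IDLE);
	memset(pg->data, 0, sizeof(pg->data));
	strncpy(pg->data, src, sizeof(pg->data) - 1);
}

/* Claim the page for migration or for a copy; fails if already claimed. */
static bool mig_try_enter(struct mig_page *pg, int want)
{
	int expected = MIG_IDLE;

	return atomic_compare_exchange_strong(&pg->state, &expected, want);
}

/* Release a claim previously taken with mig_try_enter(). */
static void mig_leave(struct mig_page *pg, int held)
{
	int expected = held;
	bool ok = atomic_compare_exchange_strong(&pg->state, &expected,
						 MIG_IDLE);

	assert(ok);
	(void)ok;
}

/*
 * Splice side: block migration only for the duration of the memcpy.
 * Returns false if migration is in progress; the caller would wait
 * for it to finish and then retry.
 */
static bool mig_splice_copy(struct mig_page *pg, char *dst, size_t len)
{
	assert(len <= sizeof(pg->data));
	if (!mig_try_enter(pg, MIG_COPYING))
		return false;
	memcpy(dst, pg->data, len);
	mig_leave(pg, MIG_COPYING);
	return true;
}
```

The state word here plays the role of a very small lock, which is why
the comparison with the page lock comes up.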
Yes, the folio lock kind-of behaves that way.
One problem might be that, while the page is spliced, there is a
raised refcount on the page: migration cannot make progress if there are
unknown references.
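
As a toy illustration of the refcount problem in userspace C: migration
compares the actual reference count against the number of references it
can account for, and backs off if they differ. The names (struct
ref_page, toy_expected_refs(), toy_can_migrate(), toy_splice_get(),
toy_splice_put()) and the simplified accounting (one reference for the
page cache, one for the migration caller) are assumptions made for this
sketch, not the kernel's actual bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page with nothing but a reference count. */
struct ref_page {
	int refcount;
};

/* References migration knows about: its own, plus the page cache's. */
static int toy_expected_refs(bool in_page_cache)
{
	return 1 + (in_page_cache ? 1 : 0);
}

/* Migration may only proceed if no unknown references exist. */
static bool toy_can_migrate(const struct ref_page *pg, bool in_page_cache)
{
	return pg->refcount == toy_expected_refs(in_page_cache);
}

/* A pipe buffer taking a reference on the spliced page. */
static void toy_splice_get(struct ref_page *pg)
{
	pg->refcount++;
}

/* The pipe buffer being consumed and dropping its reference. */
static void toy_splice_put(struct ref_page *pg)
{
	assert(pg->refcount > 0);
	pg->refcount--;
}
```

As long as the page sits in a pipe buffer, toy_can_migrate() keeps
returning false, no matter how short the actual copy is.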
--
Cheers,
David / dhildenb