Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings

On Tue, Jan 14, 2025 at 2:07 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Tue, 14 Jan 2025 at 10:55, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
> >
> >
> >
> > On 1/14/25 10:40, Miklos Szeredi wrote:
> > > On Tue, 14 Jan 2025 at 09:38, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > >
> > >> Maybe an explicit callback from the migration code to the filesystem
> > >> would work. I.e. move the complexity of dealing with migration for
> > >> problematic filesystems (netfs/fuse) to the filesystem itself.  I'm
> > >> not sure how this would actually look, as I'm unfamiliar with the
> > >> details of page migration, but I guess it shouldn't be too difficult
> > >> to implement for fuse at least.
> > >
> > > Thinking a bit...
> > >
> > > 1) reading pages
> > >
> > > Pages are allocated (PG_locked set, PG_uptodate cleared) and passed to
> > > ->readpages(), which may make the pages uptodate asynchronously.  If a
> > > page is unlocked but not set uptodate, then caller is supposed to
> > > retry the reading, at least that's how I interpret
> > > filemap_get_pages().   This means that it's fine to migrate the page
> > > before it's actually filled with data, since the caller will retry.
> > >
> > > It also means that it would be sufficient to allocate the page itself
> > > just before filling it in, if there was a mechanism to keep track of
> > > these "not yet filled" pages.  But that's probably off topic.
> >
> > With /dev/fuse buffer copies should be easy - just allocate the page
> > on buffer copy, control is in libfuse.
>
> I think the issue is with generic page cache code, which currently
> relies on the PG_locked flag on the allocated but not yet filled page.
> If the generic code were able to keep track of "under
> construction" ranges without relying on an allocated page, then the
> filesystem could allocate the page just before copying the data,
> insert the page into the cache, and mark the relevant portion of the
> file uptodate.
>
> > With splice you really need
> > a page state.
>
> It's not possible to splice a not-uptodate page.
>
> > I wrote this before already - what is the advantage of a tmp page copy
> > over /dev/fuse buffer copy? I.e. I wonder if we need splice at all here.
>
> Splice seems a dead end, but we probably need to continue supporting
> it for a while for backward compatibility.

For the splice case, could we do something like the following, or is
that too invasive?
* in mm, add a flag that marks a page as either being migrated or
temporarily blocking migration
* in splice, when we have to access the page in the pipe buffer, check
whether that flag is set and wait for the migration to complete before
proceeding
* in splice, set that flag while accessing the page, which will only
block migration temporarily (e.g. for the duration of the memcpy)

I guess this is basically what the page lock is for, but with less overhead?
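
To make the idea a bit more concrete, here is a rough userspace model
in C11. It is not kernel code: every model_* name is made up for
illustration, and it assumes the "flag" is widened to a state word (one
"migrating" bit plus a count of accessors) so that more than one splice
reader can hold off migration at a time. A real version would sleep and
retry instead of returning false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model only; none of these names exist in the kernel. */
#define MODEL_MIGRATING (1u << 31)  /* set while migration is in progress */
/* The low 31 bits count accessors that temporarily block migration. */

struct model_page {
    atomic_uint state;
    char data[64];
};

static void model_page_init(struct model_page *p)
{
    atomic_init(&p->state, 0);
    memset(p->data, 0, sizeof(p->data));
}

/*
 * Splice side: block migration for a short access. Fails if a migration
 * is already in progress; the caller would wait and retry.
 */
static bool model_page_try_pin(struct model_page *p)
{
    unsigned int old = atomic_load(&p->state);

    do {
        if (old & MODEL_MIGRATING)
            return false;
    } while (!atomic_compare_exchange_weak(&p->state, &old, old + 1));
    return true;
}

static void model_page_unpin(struct model_page *p)
{
    unsigned int old = atomic_fetch_sub(&p->state, 1);

    assert((old & ~MODEL_MIGRATING) > 0);  /* must have been pinned */
}

/* Migration side: may only start while no accessor holds a pin. */
static bool model_migrate_try_begin(struct model_page *p)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong(&p->state, &expected,
                                          MODEL_MIGRATING);
}

static void model_migrate_end(struct model_page *p)
{
    atomic_fetch_and(&p->state, ~MODEL_MIGRATING);
}

/* Copy out of the page; migration is blocked only during the memcpy. */
static bool model_splice_copy(struct model_page *p, char *dst, size_t len)
{
    if (len > sizeof(p->data) || !model_page_try_pin(p))
        return false;
    memcpy(dst, p->data, len);
    model_page_unpin(p);
    return true;
}
```

In this model migration never waits on a request to userspace, only on
the short window around the copy, which is the property the three
points above are after.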

I need to look more closely at the splice code to see how it works, but
something like this would allow us to cancel writeback on spliced
pages that have already been sent to userspace if the request is
taking too long, and migration would never get stalled. Though I guess
the flag would be specific to the migration use case, which might be a
waste of a page flag bit.


Thanks,
Joanne

>
> Thanks,
> Miklos




