Re: Mixed page compact code and (higher order) folios for filemap

On 2023/11/17 00:53, Matthew Wilcox wrote:
On Thu, Nov 16, 2023 at 04:00:40PM +1030, Qu Wenruo wrote:
On 2023/11/16 15:35, Matthew Wilcox wrote:
On Thu, Nov 16, 2023 at 02:11:00PM +1030, Qu Wenruo wrote:
E.g. if I allocate a folio with order 2, attach some private data to
the folio, then call filemap_add_folio().

Later someone calls find_lock_page() and hits the 2nd page of that folio.

I believe the regular IO is totally fine, but what would happen to the
page->private of the pages in that folio?
Would they all share the value passed to folio_attach_private(), or
hold different values?

Well, there's no magic ...

If you call find_lock_page(), you get back the precise page.  If you
call page_folio() on that page, you get back the folio that you stored.
If you then dereference folio->private, you get the pointer that you
passed to folio_attach_private().

If you dereference page->private, *that is a bug*.  You might get
NULL, you might get garbage.  Just like dereferencing page->index or
page->mapping on tail pages.  page_private() will also do the wrong thing
(we could fix that to embed a call to page_folio() ... it hasn't been
necessary before now, but if it'll help convert btrfs, then let's do it).
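
To make the head/tail distinction above concrete, here is a minimal
userspace sketch. It is a toy model only: struct toy_page and every toy_*
helper are hypothetical stand-ins invented for illustration, not the
kernel's struct page, struct folio, page_folio() or folio_attach_private().
In particular, the model sets the tail pages' private field to NULL, whereas
in the kernel the value there is unspecified.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; NOT the kernel's struct page or struct folio. */
struct toy_page {
	struct toy_page *head;	/* first page of the folio this page is in */
	void *private;		/* only meaningful on the head page */
};

/* Set up n consecutive pages as a single folio headed by pages[0]. */
static void toy_folio_init(struct toy_page *pages, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		pages[i].head = &pages[0];
		/* Assumption: NULL here; the kernel gives no such guarantee. */
		pages[i].private = NULL;
	}
}

/* Stand-in for page_folio(): any page of the folio maps to the head. */
static struct toy_page *toy_page_folio(struct toy_page *page)
{
	return page->head;
}

/* Stand-in for folio_attach_private(): data is stored on the head only. */
static void toy_folio_attach_private(struct toy_page *folio, void *data)
{
	folio->private = data;
}

/* Stand-in for reading folio->private. */
static void *toy_folio_get_private(struct toy_page *folio)
{
	return folio->private;
}
```

With four pages set up as one order-2 folio, looking up the second page and
going through toy_page_folio() returns the attached pointer, while reading
the second page's own private field does not.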

That would be great. The biggest problem I'm hitting so far is the page
cache for metadata.

We're using __GFP_NOFAIL for the current per-page allocation, but IIRC
__GFP_NOFAIL is ignored for higher order (>2 ?) folio allocation.
And we may want that per-page allocation as a last-resort
allocation anyway.

Thus I'm checking if there is something we can do here.

But I guess we can always go folio_private() instead as a workaround for
now?

I don't understand enough about what you're doing to offer useful
advice.  Is this for bs>PS or is it arbitrary large folios for better
performance?

The ultimate goal is to make the nodesize (metadata block size) > PAGE_SIZE
case use higher order folios by default, for better performance.

But use order 0 folios if we fail to get higher order folios.
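
The fallback idea can be sketched in userspace roughly as follows. This is
only an illustration under stated assumptions: struct toy_eb, toy_alloc_fn
and the other toy_* names are hypothetical, the 4096-byte page size is an
assumption, and malloc() stands in for the page allocator. It is not the
btrfs allocation path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE	4096u	/* assumed page size for the model */
#define TOY_MAX_ORDER	4u

/* Hypothetical allocator hook, so that failures can be simulated. */
typedef void *(*toy_alloc_fn)(size_t bytes);

/* Toy metadata buffer: one large folio, or several order-0 ones. */
struct toy_eb {
	unsigned int order;		/* order of each entry in folios[] */
	unsigned int nr_folios;
	void *folios[1u << TOY_MAX_ORDER];
};

/* Assumes every entry was obtained from malloc(). */
static void toy_eb_free(struct toy_eb *eb)
{
	for (unsigned int i = 0; i < eb->nr_folios; i++)
		free(eb->folios[i]);
	eb->nr_folios = 0;
}

/* Try one folio of the preferred order, then fall back to order 0. */
static bool toy_eb_alloc(struct toy_eb *eb, unsigned int order,
			 toy_alloc_fn alloc)
{
	assert(order <= TOY_MAX_ORDER);
	eb->order = 0;
	eb->nr_folios = 0;

	if (order > 0) {
		void *mem = alloc((size_t)TOY_PAGE_SIZE << order);

		if (mem) {
			eb->folios[0] = mem;
			eb->order = order;
			eb->nr_folios = 1;
			return true;
		}
	}

	/* Fallback: cover the same range with single pages. */
	for (unsigned int i = 0; i < (1u << order); i++) {
		void *mem = alloc(TOY_PAGE_SIZE);

		if (!mem) {
			toy_eb_free(eb);
			return false;
		}
		eb->folios[eb->nr_folios++] = mem;
	}
	return true;
}

/* Simulated allocator that can only satisfy single-page requests. */
static void *toy_alloc_small_only(size_t bytes)
{
	return bytes <= TOY_PAGE_SIZE ? malloc(bytes) : NULL;
}
```

Calling toy_eb_alloc(&eb, 2, malloc) normally yields a single order-2 entry,
while passing toy_alloc_small_only instead yields four order-0 entries that
cover the same range.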

The current problem is that the metadata allocation here is always
page based, and uses page->private.

If the latter, you can always fall back to order-0 folios.
If the former, well, we need to adjust a few things anyway to handle
filesystems with a minimum order ...

In general, you should be using folio_private().  page->private and
page_private() will be removed eventually.

OK, that sounds good, we can do the cleanup first inside btrfs.

Thanks,
Qu

The GFP_NOFAIL warning is:

         WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
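
Read as a plain predicate, the condition above fires for order 2 and up,
not only for orders above 2. A tiny userspace sketch of the same test
follows; TOY_GFP_NOFAIL is a placeholder bit chosen for illustration, not
the kernel's __GFP_NOFAIL value, and toy_nofail_would_warn() is a
hypothetical helper.

```c
#include <stdbool.h>

/* Placeholder flag bit for the model; NOT the kernel's __GFP_NOFAIL. */
#define TOY_GFP_NOFAIL	0x1u

/* Mirrors the condition inside the WARN_ON_ONCE() quoted above. */
static bool toy_nofail_would_warn(unsigned int gfp_flags, unsigned int order)
{
	return (gfp_flags & TOY_GFP_NOFAIL) && (order > 1);
}
```

Under this model, a NOFAIL request at order 0 or 1 passes the check, and
one at order 2 (a four-page folio) trips it.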





