On Fri, Nov 17, 2023 at 09:10:10AM +1030, Qu Wenruo wrote:
> On 2023/11/17 00:53, Matthew Wilcox wrote:
> > On Thu, Nov 16, 2023 at 04:00:40PM +1030, Qu Wenruo wrote:
> > > On 2023/11/16 15:35, Matthew Wilcox wrote:
> > > > On Thu, Nov 16, 2023 at 02:11:00PM +1030, Qu Wenruo wrote:
> > > > > E.g. if I allocated a folio with order 2, attached some private data to
> > > > > the folio, then call filemap_add_folio().
> > > > >
> > > > > Later some one called find_lock_page() and hit the 2nd page of that folio.
> > > > >
> > > > > I believe the regular IO is totally fine, but what would happen for the
> > > > > page->private of that folio?
> > > > > Would them all share the same value of the folio_attach_private()? Or
> > > > > some different values?
> > > >
> > > > Well, there's no magic ...
> > > >
> > > > If you call find_lock_page(), you get back the precise page. If you
> > > > call page_folio() on that page, you get back the folio that you stored.
> > > > If you then dereference folio->private, you get the pointer that you
> > > > passed to folio_attach_private().
> > > >
> > > > If you dereference page->private, *that is a bug*. You might get
> > > > NULL, you might get garbage. Just like dereferencing page->index or
> > > > page->mapping on tail pages. page_private() will also do the wrong thing
> > > > (we could fix that to embed a call to page_folio() ... it hasn't been
> > > > necessary before now, but if it'll help convert btrfs, then let's do it).
> > >
> > > That would be great. The biggest problem I'm hitting so far is the page
> > > cache for metadata.
> > >
> > > We're using __GFP_NOFAIL for the current per-page allocation, but IIRC
> > > __GFP_NOFAIL is ignored for higher order (>2 ?) folio allocation.
> > > And we may want that per-page allocation as the last resort effort
> > > allocation anyway.
> > >
> > > Thus I'm checking if there is something we can do here.
> > >
> > > But I guess we can always go folio_private() instead as a workaround for
> > > now?
> >
> > I don't understand enough about what you're doing to offer useful
> > advice. Is this for bs>PS or is it arbitrary large folios for better
> > performance? If the latter, you can always fall back to order-0 folios.
> > If the former, well, we need to adjust a few things anyway to handle
> > filesystems with a minimum order ...
> >
> > In general, you should be using folio_private(). page->private and
> > page_private() will be removed eventually.
>
> Just another question.
>
> What about flags like PageDirty? Are they synced with folio?

Yes. You can SetPageDirty() in one function and then folio_test_dirty()
in another. Eventually all the PageFoo() functions will be removed,
except PageHWPoison and PageAnonExclusive.

> The declaration goes PF_HEAD for policy, thus for order 0 it makes no
> difference, but for higher order folios, we should switch to pure folio
> based operations other than mixing page and folios?

Every function in btrfs should be folio based. There should be nothing
in btrfs that deals with pages. Take a look at iomap's buffered I/O paths
for hints -- there's a per-block dirty and uptodate bit, but other than
that, everything is done with folios.
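
To make the rule above concrete, here is a rough userspace sketch of the
relationship between a tail page, its folio, the private pointer and the
dirty flag. It is only a toy model under stated assumptions: every toy_*
name is made up for illustration, none of this is the kernel's struct page
or struct folio layout, and the real page_folio(), folio_attach_private()
and PF_HEAD flag policy are implemented differently. It only shows that
everything is reached through the folio, never through a tail page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ORDER    2                  /* order-2 folio, as in the question */
#define TOY_NR_PAGES (1u << TOY_ORDER)  /* 4 pages */

struct toy_folio;

/* Hypothetical stand-in for a page: it only knows which folio owns it. */
struct toy_page {
	struct toy_folio *folio;
	void *private;          /* per-page field; never set via the folio API */
};

/* Hypothetical stand-in for a folio: private data and dirty state live here. */
struct toy_folio {
	void *private;
	bool dirty;
	struct toy_page pages[TOY_NR_PAGES];
};

static void toy_folio_init(struct toy_folio *folio)
{
	folio->private = NULL;
	folio->dirty = false;
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		folio->pages[i].folio = folio;
		/* The toy zeroes this; per the mail, the real field may hold garbage. */
		folio->pages[i].private = NULL;
	}
}

/* Models page_folio(): any page, head or tail, maps back to its folio. */
static struct toy_folio *toy_page_folio(struct toy_page *page)
{
	return page->folio;
}

/* Models folio_attach_private(): the pointer is stored once, on the folio. */
static void toy_folio_attach_private(struct toy_folio *folio, void *data)
{
	assert(folio->private == NULL);
	folio->private = data;
}

/* Models folio_private(): the only correct way to read the pointer back. */
static void *toy_folio_private(struct toy_folio *folio)
{
	return folio->private;
}

/* Models a head-redirected page flag setter: the bit lands on the folio. */
static void toy_set_page_dirty(struct toy_page *page)
{
	toy_page_folio(page)->dirty = true;
}

/* Models folio_test_dirty(). */
static bool toy_folio_test_dirty(struct toy_folio *folio)
{
	return folio->dirty;
}
```

In this model, looking up the 2nd page, converting it with
toy_page_folio() and then calling toy_folio_private() returns the pointer
that was attached, while the page's own private field never sees it.
Likewise a dirty bit set through the 2nd page is visible through the folio.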