On Fri, Sep 17, 2021 at 12:31:36PM -0400, Johannes Weiner wrote:
> My question for fs folks is simply this: as long as you can pass a
> folio to kmap and mmap and it knows what to do with it, is there any
> filesystem relevant requirement that the folio map to 1 or more
> literal "struct page", and that folio_page(), folio_nr_pages() etc be
> part of the public API?

In the short term, yes, we need those things in the public API. In
the long term, not so much.

We need something in the public API that tells us the offset and
size of the folio. Lots of page cache code currently does stuff like
calculate the size or iteration counts based on the difference of
page->index values (i.e. number of pages) and iterate page by page.
A direct conversion of such algorithms increments by
folio_nr_pages() instead of 1. So stuff like this is definitely
necessary as public APIs in the initial conversion.

Let's face it, folio_nr_pages() is a huge improvement on directly
exposing THP/compound page interfaces to filesystems and leaving
them to work it out for themselves. So even in the short term, these
API members represent a major step forward in mm API cleanliness.

As for long term, everything in the page cache API needs to
transition to byte offsets and byte counts instead of units of
PAGE_SIZE and page->index. That's a more complex transition, but
AFAIA that's part of the future work Willy intends to do with
folios and the folio API. Once we get away from accounting and
tracking everything as units of struct page, all the public facing
APIs that use those units can go away.

It's fairly slow to do this, because we have so much code that is
doing stuff like converting file offsets between byte counts and
page counts and vice versa. And it's not necessary to do an initial
conversion to folios, either.
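
As a rough illustration of the difference between walking a range in
page-index units and walking it in byte units, here is a minimal
user-space sketch. It is not kernel code: struct mock_folio and the
mock_folio_*() helpers are hypothetical stand-ins invented for this
example, and a 4096 byte page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch only: 4096 byte pages. */
#define MOCK_PAGE_SHIFT 12

/* Hypothetical stand-in for a folio: (1 << order) contiguous pages. */
struct mock_folio {
	unsigned long index;	/* page cache index of the first page */
	unsigned int order;	/* folio covers (1 << order) pages */
};

unsigned long mock_folio_nr_pages(const struct mock_folio *f)
{
	return 1UL << f->order;
}

/* Byte offset of the start of the folio within the file. */
unsigned long long mock_folio_pos(const struct mock_folio *f)
{
	return (unsigned long long)f->index << MOCK_PAGE_SHIFT;
}

/* Size of the folio in bytes. */
unsigned long long mock_folio_size(const struct mock_folio *f)
{
	return (unsigned long long)mock_folio_nr_pages(f) << MOCK_PAGE_SHIFT;
}

/*
 * Page-index style: a direct conversion of a page-by-page loop that
 * advances by the folio's page count instead of by 1.
 * Returns the index one past the end of the run.
 */
unsigned long walk_by_index(const struct mock_folio *folios, size_t n)
{
	unsigned long index = n ? folios[0].index : 0;

	for (size_t i = 0; i < n; i++) {
		assert(folios[i].index == index);	/* run is contiguous */
		index += mock_folio_nr_pages(&folios[i]);
	}
	return index;
}

/*
 * Byte-range style: only the folio's offset and size are needed, so
 * no page counts appear in the loop at all.
 * Returns the byte offset one past the end of the run.
 */
unsigned long long walk_by_bytes(const struct mock_folio *folios, size_t n)
{
	unsigned long long pos = n ? mock_folio_pos(&folios[0]) : 0;

	for (size_t i = 0; i < n; i++) {
		assert(mock_folio_pos(&folios[i]) == pos);
		pos += mock_folio_size(&folios[i]);
	}
	return pos;
}
```

For a contiguous run of folios of orders 0, 0, 1 and 2 starting at
index 0, walk_by_index() ends at index 8 and walk_by_bytes() ends at
byte offset 32768. Only the first loop depends on a page-count helper.
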

But once everything in the page cache indexing API moves to byte
ranges, the need to count pages, use page counts as ranges, iterate
by page index, etc all goes away and hence those APIs can also go
away.

As for converting between folios and pages, we'll need those sorts
of APIs for the foreseeable future because low level storage layers
and hardware use pages for their scatter gather arrays and at some
point we've got to expose those pages from behind the folio API.
Even if we replace struct page with some other hardware page
descriptor, we're still going to need such translation APIs at some
point in the stack....

> Or can we keep this translation layer private
> to MM code? And will page_folio() be required for anything beyond the
> transitional period away from pages?

No idea, but as per above I think it's a largely irrelevant concern
for the foreseeable future because pages will be here for a long
time yet.

> Can we move things not used outside of MM into mm/internal.h, mark the
> transitional bits of the public API as such, and move on?

Sure, but that's up to you to do as a patch set on top of Willy's
folio trees if you think it improves the status quo. Write the
patches and present them for review just like everyone else does,
and they can be discussed on their merits in that context rather
than being presented as a reason for blocking current progress on
folios.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx