Adding Linus to the Cc (of this one only): he surely has an interest.

On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:

> Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
> benefit from a larger "page size".  As an example, an earlier iteration
> of this idea which used compound pages (and wasn't particularly tuned)
> got a 7% performance boost when compiling the kernel.
>
> Using compound pages or THPs exposes a serious weakness in our type
> system.  Functions are often unprepared for compound pages to be passed
> to them, and may only act on PAGE_SIZE chunks.  Even functions which are
> aware of compound pages may expect a head page, and do the wrong thing
> if passed a tail page.
>
> There have been efforts to label function parameters as 'head' instead
> of 'page' to indicate that the function expects a head page, but this
> leaves us with runtime assertions instead of using the compiler to prove
> that nobody has mistakenly passed a tail page.  Calling a struct page
> 'head' is also inaccurate as they will work perfectly well on base pages.
>
> We also waste a lot of instructions ensuring that we're not looking at
> a tail page.  Almost every call to PageFoo() contains one or more hidden
> calls to compound_head().  This also happens for get_page(), put_page()
> and many more functions.  There does not appear to be a way to tell gcc
> that it can cache the result of compound_head(), nor is there a way to
> tell it that compound_head() is idempotent.
>
> This series introduces the 'struct folio' as a replacement for
> head-or-base pages.  This initial set reduces the kernel size by
> approximately 6kB by removing conversions from tail pages to head pages.
> The real purpose of this series is adding infrastructure to enable
> further use of the folio.
>
> The medium-term goal is to convert all filesystems and some device
> drivers to work in terms of folios.  This series contains a lot of
> explicit conversions, but it's important to realise it's removing a lot
> of implicit conversions in some relatively hot paths.  There will be very
> few conversions from folios when this work is completed; filesystems,
> the page cache, the LRU and so on will generally only deal with folios.
>
> The text size reduces by between 6kB (a config based on Oracle UEK)
> and 1.2kB (allnoconfig).  Performance seems almost unaffected based
> on kernbench.
>
> Current tree at:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
>
> (contains another ~120 patches on top of this batch, not all of which are
> in good shape for submission)
>
> v8.1:
>  - Rebase on next-20210430
>  - You need https://lore.kernel.org/linux-mm/20210430145549.2662354-1-willy@xxxxxxxxxxxxx/ first
>  - Big renaming (thanks to peterz):
>    - PageFoo() becomes folio_foo()
>    - SetFolioFoo() becomes folio_set_foo()
>    - ClearFolioFoo() becomes folio_clear_foo()
>    - __SetFolioFoo() becomes __folio_set_foo()
>    - __ClearFolioFoo() becomes __folio_clear_foo()
>    - TestSetPageFoo() becomes folio_test_set_foo()
>    - TestClearPageFoo() becomes folio_test_clear_foo()
>    - PageHuge() is now folio_hugetlb()
>    - put_folio() becomes folio_put()
>    - get_folio() becomes folio_get()
>    - put_folio_testzero() becomes folio_put_testzero()
>    - set_folio_count() becomes folio_set_count()
>    - attach_folio_private() becomes folio_attach_private()
>    - detach_folio_private() becomes folio_detach_private()
>    - lock_folio() becomes folio_lock()
>    - unlock_folio() becomes folio_unlock()
>    - trylock_folio() becomes folio_trylock()
>    - __lock_folio_or_retry becomes __folio_lock_or_retry()
>    - __lock_folio_async() becomes __folio_lock_async()
>    - wake_up_folio_bit() becomes folio_wake_bit()
>    - wake_up_folio() becomes folio_wake()
>    - wait_on_folio_bit() becomes folio_wait_bit()
>    - wait_for_stable_folio() becomes folio_wait_stable()
>    - wait_on_folio() becomes folio_wait()
>    - wait_on_folio_locked() becomes folio_wait_locked()
>    - wait_on_folio_writeback() becomes folio_wait_writeback()
>    - end_folio_writeback() becomes folio_end_writeback()
>    - add_folio_wait_queue() becomes folio_add_wait_queue()
>  - Add folio_young() and folio_idle() family of functions
>  - Move page_folio() to page-flags.h and use _compound_head()
>  - Make page_folio() const-preserving
>  - Add folio_page() to get the nth page from a folio
>  - Improve struct folio kernel-doc
>  - Convert folio flag tests to return bool instead of int
>  - Eliminate set_folio_private()
>  - folio_get_private() is the equivalent of page_private() (as folio_private()
>    is now a test for whether the private flag is set on the folio)
>  - Move folio_rotate_reclaimable() into this patchset
>  - Add page-flags.h to the kernel-doc
>  - Add netfs.h to the kernel-doc
>  - Add a family of folio_lock_lruvec() wrappers
>  - Add a family of folio_relock_lruvec() wrappers
>
> v7:
> https://lore.kernel.org/linux-mm/20210409185105.188284-1-willy@xxxxxxxxxxxxx/
>
> Matthew Wilcox (Oracle) (31):
>   mm: Introduce struct folio
>   mm: Add folio_pgdat and folio_zone
>   mm/vmstat: Add functions to account folio statistics
>   mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
>   mm: Add folio reference count functions
>   mm: Add folio_put
>   mm: Add folio_get
>   mm: Add folio flag manipulation functions
>   mm: Add folio_young() and folio_idle()
>   mm: Handle per-folio private data
>   mm/filemap: Add folio_index, folio_file_page and folio_contains
>   mm/filemap: Add folio_next_index
>   mm/filemap: Add folio_offset and folio_file_offset
>   mm/util: Add folio_mapping and folio_file_mapping
>   mm: Add folio_mapcount
>   mm/memcg: Add folio wrappers for various functions
>   mm/filemap: Add folio_unlock
>   mm/filemap: Add folio_lock
>   mm/filemap: Add folio_lock_killable
>   mm/filemap: Add __folio_lock_async
>   mm/filemap: Add __folio_lock_or_retry
>   mm/filemap: Add folio_wait_locked
>   mm/swap: Add folio_rotate_reclaimable
>   mm/filemap: Add folio_end_writeback
>   mm/writeback: Add folio_wait_writeback
>   mm/writeback: Add folio_wait_stable
>   mm/filemap: Add folio_wait_bit
>   mm/filemap: Add folio_wake_bit
>   mm/filemap: Convert page wait queues to be folios
>   mm/filemap: Add folio private_2 functions
>   fs/netfs: Add folio fscache functions
>
>  Documentation/core-api/mm-api.rst           |   4 +
>  Documentation/filesystems/netfs_library.rst |   2 +
>  fs/afs/write.c                              |   9 +-
>  fs/cachefiles/rdwr.c                        |  16 +-
>  fs/io_uring.c                               |   2 +-
>  include/linux/memcontrol.h                  |  58 ++++
>  include/linux/mm.h                          | 173 ++++++++++--
>  include/linux/mm_types.h                    |  71 +++++
>  include/linux/mmdebug.h                     |  20 ++
>  include/linux/netfs.h                       |  77 +++--
>  include/linux/page-flags.h                  | 222 +++++++++++----
>  include/linux/page_idle.h                   |  99 ++++---
>  include/linux/page_ref.h                    |  88 +++++-
>  include/linux/pagemap.h                     | 276 +++++++++++++-----
>  include/linux/swap.h                        |   7 +-
>  include/linux/vmstat.h                      | 107 +++++++
>  mm/Makefile                                 |   2 +-
>  mm/filemap.c                                | 295 ++++++++++----------
>  mm/folio-compat.c                           |  37 +++
>  mm/internal.h                               |   1 +
>  mm/memory.c                                 |   8 +-
>  mm/page-writeback.c                         |  72 +++--
>  mm/page_io.c                                |   4 +-
>  mm/swap.c                                   |  18 +-
>  mm/swapfile.c                               |   8 +-
>  mm/util.c                                   |  30 +-
>  26 files changed, 1247 insertions(+), 459 deletions(-)
>  create mode 100644 mm/folio-compat.c
>
> --
> 2.30.2
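
For anyone following along without the tree to hand, here is a minimal
userspace sketch of the type-safety idea in the quoted cover letter.  It
is an illustration only, not the code from the series: the struct page
fields (head, order, refcount) and all of the function bodies are
simplified assumptions, and only the names compound_head(), get_page(),
page_folio(), folio_get() and folio_page() come from the cover letter
(folio_nr_pages() is added here as a helper).  The point it tries to show
is that a page-based helper must normalise to the head page on every
call, while a folio-based helper can rely on the type to say that this
has already been done.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page (layout is an assumption). */
struct page {
	unsigned long flags;
	struct page *head;	/* NULL on a head or base page, else the head page */
	unsigned int order;	/* head/base pages only: spans 1 << order pages */
	int refcount;		/* head/base pages only */
};

/* A folio wraps a page that is, by construction, never a tail page. */
struct folio {
	struct page page;
};

/* Map any page (head, base or tail) to its head page. */
static inline struct page *compound_head(struct page *page)
{
	return page->head ? page->head : page;
}

/* Old style: every helper has to pay for a hidden compound_head(). */
static inline void get_page(struct page *page)
{
	compound_head(page)->refcount++;
}

/* The single explicit conversion from "any page" to "folio". */
static inline struct folio *page_folio(struct page *page)
{
	return (struct folio *)compound_head(page);
}

/* New style: the argument type rules out tail pages, so no lookup. */
static inline void folio_get(struct folio *folio)
{
	folio->page.refcount++;
}

/* Number of base pages covered by the folio. */
static inline size_t folio_nr_pages(const struct folio *folio)
{
	return (size_t)1 << folio->page.order;
}

/* Return the nth page of the folio (pages are assumed contiguous). */
static inline struct page *folio_page(struct folio *folio, size_t n)
{
	assert(n < folio_nr_pages(folio));
	return &folio->page + n;
}
```

A caller holding a tail page would call page_folio() once and then use
the folio_*() helpers, so the head-page lookup happens at one visible
place instead of inside every helper.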