[no subject]

**Date** **Thread**

Design
===

folio_split() splits a large folio in the same way as buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if user wants to free the
3rd subpage in a order-9 folio, folio_split() will split the order-9 folio
as:
O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon
O-1,      O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-9 if it is pagecache
Since anon folio does not support order-1 yet.

The split process is similar to existing approach:
1. Unmap all page mappings (split PMD mappings if exist);
2. Split meta data like memcg, page owner, page alloc tag;
3. Copy meta data in struct folio to sub pages, but instead of spliting
   the whole folio into multiple smaller ones with the same order in a
   shot, this approach splits the folio iteratively. Taking the example
   above, this approach first splits the original order-9 into two order-8,
   then splits left part of order-8 to two order-7 and so on;
4. Post-process split folios, like write mapping->i_pages for pagecache,
   adjust folio refcounts, add split folios to corresponding list;
5. Remap split folios
6. Unlock split folios.

__folio_split_without_mapping() and __split_folio_to_order() replace
__split_huge_page() and __split_huge_page_tail() respectively.
__folio_split_without_mapping() uses different approaches to perform
uniform split and buddy allocator like split:
1. uniform split: one single call to __split_folio_to_order() is used to
   uniformly split the given folio. All resulting folios are put back to
   the list after split. The folio containing the given page is left to
   caller to unlock and others are unlocked.

2. buddy allocator like split: old_order - new_order calls to
   __split_folio_to_order() are used to split the given folio at order N to
   order N-1. After each call, the target folio is changed to the one
   containing the page, which is given via folio_split() parameters.
   After each call, folios not containing the page are put back to the list.
   The folio containing the page is put back to the list when its order
   is new_order. All folios are unlocked except the first folio, which
   is left to caller to unlock.

TODOs
===
1. xas_nomem() is used in buddy allocator like split like Matthew
   suggested, but I see kmemleak during its use. I would like to get some
   code review from Matthew on it.

2. A proper error handling is needed when xas_nomem() fails to allocate memory
   for xas_split(). The target folio should be put back to the list and
   all after-split folios should be unlocked. (I realize this when I am
   writing the cover letter)

Any comments and/or suggestions are welcome. Thanks.

[1] https://lore.kernel.org/linux-mm/20241008223748.555845-1-ziy@xxxxxxxxxx/

Zi Yan (3):
  mm/huge_memory: buddy allocator like folio_split()
  mm/huge_memory: add folio_split() to debugfs testing interface.
  mm/truncate: use folio_split() for truncate operation.

 include/linux/huge_mm.h |  12 +
 mm/huge_memory.c        | 650 +++++++++++++++++++++++++---------------
 mm/truncate.c           |   5 +-
 3 files changed, 422 insertions(+), 245 deletions(-)

-- 
2.45.2