On 09/08/2023 20:17, David Hildenbrand wrote:
> On 09.08.23 21:07, Ryan Roberts wrote:
>> On 09/08/2023 09:32, David Hildenbrand wrote:
>>> Let's track the total mapcount for all large folios in the first subpage.
>>>
>>> The total mapcount is what we actually want to know in folio_mapcount()
>>> and it is also sufficient for implementing folio_mapped(). This also
>>> gets rid of any "raciness" concerns as expressed in
>>> folio_total_mapcount().
>>>
>>> With sub-PMD THP becoming more important and things looking promising
>>> that we will soon get support for such anon THP, we want to avoid looping
>>> over all pages of a folio just to calculate the total mapcount. Further,
>>> we may soon want to use the total mapcount in other contexts more
>>> frequently, so prepare for reading it efficiently and atomically.
>>>
>>> Make room for the total mapcount in page[1] of the folio by moving
>>> _nr_pages_mapped to page[2] of the folio: it is not applicable to hugetlb
>>> -- and with the total mapcount in place probably also not desirable even
>>> if PMD-mappable hugetlb pages could get PTE-mapped at some point -- so we
>>> can overlay the hugetlb fields.
>>>
>>> Note that we currently don't expect any order-1 compound pages / THP in
>>> rmap code, and that such support is not planned. If ever desired, we could
>>> move the compound mapcount to another page, because it only applies to
>>> PMD-mappable folios that are definitely larger. Let's avoid consuming
>>> more space elsewhere for now -- we might need more space for a different
>>> purpose in some subpages soon.
>>>
>>> Maintain the total mapcount also for hugetlb pages. Use the total mapcount
>>> to implement folio_mapcount(), total_mapcount(), folio_mapped() and
>>> page_mapped().
>>>
>>> We can now get rid of folio_total_mapcount() and folio_large_is_mapped()
>>> by simply inlining the read of the total mapcount.
>>>
>>> _nr_pages_mapped is now only used in rmap code, so it cannot accidentally
>>> be used externally, where it might end up applied to arbitrary order-1
>>> pages. The remaining usage is:
>>>
>>> (1) Detect how to adjust stats: NR_ANON_MAPPED and NR_FILE_MAPPED
>>>     -> If we were to account the whole folio as mapped when mapping a
>>>        page (based on the total mapcount), we could remove that usage.
>>>
>>> (2) Detect when to add a folio to the deferred split queue
>>>     -> If we were to apply a different heuristic, or scan via the rmap on
>>>        the memory reclaim path for partially mapped anon folios to
>>>        split them, we could remove that usage as well.
>>>
>>> So maybe we can simplify things in the future and remove
>>> _nr_pages_mapped. For now, leave these things as they are; they need more
>>> thought. Hugh really did a nice job implementing that precise tracking
>>> after all.
>>>
>>> Note: Not adding order-1 sanity checks to the file_rmap functions for
>>> now. For anon pages, they are certainly not required right now.
>>>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Jonathan Corbet <corbet@xxxxxxx>
>>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
>>> Cc: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
>>> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
>>> Cc: Yin Fengwei <fengwei.yin@xxxxxxxxx>
>>> Cc: Yang Shi <shy828301@xxxxxxxxx>
>>> Cc: Zi Yan <ziy@xxxxxxxxxx>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>
>> Other than the nits and the query on zeroing _total_mapcount below, LGTM.
>> If the zeroing is correct:
>>
>> Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>
> Thanks for the review!
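(For readers following along: a minimal sketch of what folio_mapcount() could
look like with the total mapcount in place. The field names match the patch,
but this snippet is an illustration, not the exact patch code.)

	static inline int folio_mapcount(struct folio *folio)
	{
		/* Small folios keep using the per-page _mapcount. */
		if (likely(!folio_test_large(folio)))
			return atomic_read(&folio->_mapcount) + 1;
		/* Mapcounts are stored off-by-one: -1 means "not mapped". */
		return atomic_read(&folio->_total_mapcount) + 1;
	}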
>
> [...]
>
>>>   static inline int total_mapcount(struct page *page)
>>
>> nit: couldn't total_mapcount() just be implemented as a wrapper around
>> folio_mapcount()? What's the benefit of the PageCompound() check instead of
>> just getting the folio and checking whether it's large? i.e. something like
>> the sketch at the bottom of this mail.
>
> Good point, let me take a look tomorrow at whether the compiler can optimize
> both cases equally well.
>
> [...]
>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 5f498e8025cc..6a614c559ccf 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -1479,7 +1479,7 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
>>>   	struct page *p;
>>>   	atomic_set(&folio->_entire_mapcount, 0);
>>> -	atomic_set(&folio->_nr_pages_mapped, 0);
>>> +	atomic_set(&folio->_total_mapcount, 0);
>>
>> Just checking this is definitely what you intended? _total_mapcount is -1
>> when it means "no pages mapped", so 0 means 1 page mapped?
>
> I was blindly doing what _entire_mapcount is doing: zeroing out the values. ;)
>
> But let's look at the details: in __destroy_compound_gigantic_folio(), we're
> manually dissolving the whole compound page. So instead of actually returning
> a compound page to the buddy (where we would make sure the mapcounts are -1,
> to then zero them out!), we simply zero out the fields we use and then
> dissolve the compound page, to be left with a bunch of order-0 pages whose
> memmap is in a clean state.
>
> (The buddy doesn't handle that page order, so we have to do things manually
> to get to order-0 pages we can reuse or free.)

Yeah, fair enough; thanks for the explanation.
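(For illustration, a minimal sketch of the wrapper suggested above. This is a
reconstruction, not the snippet from the original mail; it assumes
folio_mapcount() already handles both small and large folios.)

	static inline int total_mapcount(struct page *page)
	{
		/*
		 * folio_mapcount() already distinguishes small from large
		 * folios, so no separate PageCompound() check is needed.
		 */
		return folio_mapcount(page_folio(page));
	}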