On 25.04.24 00:46, Zi Yan wrote:
> From: Zi Yan <ziy@xxxxxxxxxx>
>
> In __folio_remove_rmap(), a large folio is added to the deferred split
> list if any page in the folio loses its final mapping. It is possible
> that the folio is fully unmapped, in which case adding it to the
> deferred split list is unnecessary. Fix it by checking
> folio->_nr_pages_mapped before adding a folio to the deferred split
> list. If the folio is already on the deferred split list, it will be
> skipped.
>
> This issue applies to both PTE-mapped THP and mTHP.
>
> Commit 98046944a159 ("mm: huge_memory: add the missing
> folio_test_pmd_mappable() for THP split statistics") tried to exclude
> mTHP deferred split stats from THP_DEFERRED_SPLIT_PAGE, but it does not
> fix the above issue. A fully unmapped PTE-mapped order-9 THP was still
Once again: your patch won't fix it either.
> added to the deferred split list and counted as THP_DEFERRED_SPLIT_PAGE,
> since nr is 512 (non-zero), level is RMAP_LEVEL_PTE, and inside
> deferred_split_folio() the order-9 folio is folio_test_pmd_mappable().
> However, this miscount was present even earlier due to the
> implementation, since PTEs are unmapped individually and the first PTE
> unmapping adds the THP to the deferred split list.
It will still be present. Just less frequently.
> With commit b06dc281aa99 ("mm/rmap: introduce
> folio_remove_rmap_[pte|ptes|pmd]()"), the kernel is able to unmap
> PTE-mapped folios in one shot without causing the miscount, hence this
> patch.
>
> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
> Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>
> ---
>  mm/rmap.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index a7913a454028..2809348add7b 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1553,9 +1553,10 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>  		 * page of the folio is unmapped and at least one page
>  		 * is still mapped.
>  		 */
> -		if (folio_test_large(folio) && folio_test_anon(folio))
> -			if (level == RMAP_LEVEL_PTE || nr < nr_pmdmapped)
> -				deferred_split_folio(folio);
> +		if (folio_test_large(folio) && folio_test_anon(folio) &&
> +		    ((level == RMAP_LEVEL_PTE && atomic_read(mapped)) ||
> +		     (level == RMAP_LEVEL_PMD && nr < nr_pmdmapped)))
> +			deferred_split_folio(folio);
Please refrain from posting a new patch before the discussion on the old one is done.
See my comments on v2 for why optimizing out the function call is a
reasonable thing to do *where we cannot batch*, and why the misaccounting
will still happen. But that can be done independently.
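To make that concrete, here is a minimal user-space sketch, an
illustration only and not kernel code: nr_pages_mapped and queued are
hypothetical stand-ins for folio->_nr_pages_mapped and the deferred-split
list state, and the check mirrors the patched RMAP_LEVEL_PTE condition
(queue only while some pages of the folio remain mapped).

/*
 * Standalone sketch, not kernel code: models why unmapping an order-9
 * folio one PTE at a time can still queue it for deferred split under
 * the patched condition, while a single batched unmap of all 512 PTEs
 * does not. nr_pages_mapped and queued are simplified stand-ins for
 * folio->_nr_pages_mapped and the deferred-split list state.
 */
#include <stdbool.h>
#include <stdio.h>

#define FOLIO_NR_PAGES 512	/* order-9 THP */

static int nr_pages_mapped;
static bool queued;

/* Patched check: only queue while some pages of the folio remain mapped. */
static void remove_rmap_ptes(int nr)
{
	nr_pages_mapped -= nr;
	if (nr_pages_mapped > 0 && !queued) {
		queued = true;
		printf("queued with %d page(s) still mapped\n",
		       nr_pages_mapped);
	}
}

int main(void)
{
	/* Batched unmap of the whole folio: never queued. */
	nr_pages_mapped = FOLIO_NR_PAGES;
	queued = false;
	remove_rmap_ptes(FOLIO_NR_PAGES);
	printf("batched unmap: queued=%d\n", queued);

	/*
	 * Per-PTE unmap, i.e. where batching is not possible: the first
	 * call still sees 511 pages mapped and queues the folio, so the
	 * THP_DEFERRED_SPLIT_PAGE count still goes up, just less often.
	 */
	nr_pages_mapped = FOLIO_NR_PAGES;
	queued = false;
	for (int i = 0; i < FOLIO_NR_PAGES; i++)
		remove_rmap_ptes(1);
	printf("per-PTE unmap: queued=%d\n", queued);

	return 0;
}

Under these assumptions the batched case never queues the folio, while
the per-PTE loop queues it on the very first unmap, which is the residual
miscount being discussed.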
-- 
Cheers,

David / dhildenb