On 5/22/23 21:42, Baolin Wang wrote:
On 5/18/2023 12:15 AM, Khalid Aziz wrote:
Pages pinned in memory through extra refcounts can not be migrated.
Currently as isolate_migratepages_block() scans pages for
compaction, it skips any pinned anonymous pages. All non-migratable
pages should be skipped and not just the anonymous pinned pages.
This patch adds a check for extra refcounts on a page to determine
if the page can be migrated. This was seen as a real issue on a
customer workload where a large number of pages were pinned by vfio
on the host and any attempts to allocate hugepages resulted in
significant amount of cpu time spent in either direct compaction or
in kcompactd scanning vfio pinned pages over and over again that can
not be migrated.
Signed-off-by: Khalid Aziz <khalid.aziz@xxxxxxxxxx>
Suggested-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
---
v3:
- Account for extra ref added by get_page_unless_zero() earlier
in isolate_migratepages_block() (Suggested by Huang, Ying)
- Clean up computation of extra refs to be consistent
(Suggested by Huang, Ying)
v2:
- Update comments in the code (Suggested by Andrew)
- Use PagePrivate() instead of page_has_private() (Suggested
by Matthew)
- Pass mapping to page_has_extrarefs() (Suggested by Matthew)
- Use page_ref_count() (Suggested by Matthew)
- Rename is_pinned_page() to reflect its function more
accurately (Suggested by Matthew)
mm/compaction.c | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 5a9501e0ae01..f04c00981172 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -764,6 +764,34 @@ static bool too_many_isolated(pg_data_t *pgdat)
return too_many;
}
+/*
+ * Check if this base page should be skipped from isolation because
+ * it has extra refcounts that will prevent it from being migrated.
+ * This function is called for regular pages only, and not
+ * for THP or hugetlbfs pages. This code is inspired by similar code
+ * in migrate_vma_check_page(), can_split_folio() and
+ * folio_migrate_mapping()
+ */
+static inline bool page_has_extra_refs(struct page *page)
+{
+ /* caller holds a ref already from get_page_unless_zero() */
+ unsigned long extra_refs = 1;
+
+ /* anonymous page can have extra ref from swap cache */
+ if (PageAnon(page))
+ extra_refs += PageSwapCache(page) ? 1 : 0;
+ else
+ extra_refs += 1 + PagePrivate(page);
+
+ /*
+ * This is an admittedly racy check but good enough to determine
+ * if a page is pinned and can not be migrated
+ */
+ if ((page_ref_count(page) - extra_refs) > page_mapcount(page))
Could you explain why changing total_mapcount() to page_mapcount()? The original commit 829ae0f81ce0 ("mm: migrate: fix
THP's mapcount on isolation") did fix a THP's mapcount issue with converting to total_mapcount().
I had based this code on migrate_vma_check_page() which uses page_mapcount() but you are right. In the code for
page_has_extra_refs(), this should stay total_mapcount(). I will fix it.
Thanks,
Khalid
+ return true;
+ return false;
+}
+
/**
* isolate_migratepages_block() - isolate all migrate-able pages within
* a single pageblock
@@ -992,12 +1020,12 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
goto isolate_fail;
/*
- * Migration will fail if an anonymous page is pinned in memory,
- * so avoid taking lru_lock and isolating it unnecessarily in an
- * admittedly racy check.
+ * Migration will fail if a page has extra refcounts
+ * preventing it from migrating, so avoid taking
+ * lru_lock and isolating it unnecessarily
*/
mapping = page_mapping(page);
- if (!mapping && (page_count(page) - 1) > total_mapcount(page))
+ if (page_has_extra_refs(page))
goto isolate_fail_put;
/*