The patch titled
     Subject: mm/migrate: split source folio if it is on deferred split list
has been added to the -mm mm-unstable branch.  Its filename is
     mm-migrate-split-source-folio-if-it-is-on-deferred-split-list.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-migrate-split-source-folio-if-it-is-on-deferred-split-list.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Zi Yan <ziy@xxxxxxxxxx>
Subject: mm/migrate: split source folio if it is on deferred split list
Date: Tue, 19 Mar 2024 11:47:53 -0400

If the source folio is on deferred split list, it is likely some subpages
are not used.  Split it before migration to avoid migrating unused
subpages.

Commit 616b8371539a6 ("mm: thp: enable thp migration in generic path") did
not check if a THP is on deferred split list before migration, thus, the
destination THP is never put on deferred split list even if the source THP
might be.  The opportunity of reclaiming free pages in a partially mapped
THP during deferred list scanning is lost, but no other harmful
consequence is present[1].

[1]: https://lore.kernel.org/linux-mm/03CE3A00-917C-48CC-8E1C-6A98713C817C@xxxxxxxxxx/

Link: https://lkml.kernel.org/r/20240319154753.253262-1-zi.yan@xxxxxxxx
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Huang, Ying <ying.huang@xxxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
Cc: Yang Shi <shy828301@xxxxxxxxx>
Cc: Yin Fengwei <fengwei.yin@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/huge_memory.c |   22 ----------------
 mm/internal.h    |   23 +++++++++++++++++
 mm/migrate.c     |   60 ++++++++++++++++++++++++++++++++++++---------
 3 files changed, 72 insertions(+), 33 deletions(-)

--- a/mm/huge_memory.c~mm-migrate-split-source-folio-if-it-is-on-deferred-split-list
+++ a/mm/huge_memory.c
@@ -766,28 +766,6 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struc
 	return pmd;
 }
 
-#ifdef CONFIG_MEMCG
-static inline
-struct deferred_split *get_deferred_split_queue(struct folio *folio)
-{
-	struct mem_cgroup *memcg = folio_memcg(folio);
-	struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
-
-	if (memcg)
-		return &memcg->deferred_split_queue;
-	else
-		return &pgdat->deferred_split_queue;
-}
-#else
-static inline
-struct deferred_split *get_deferred_split_queue(struct folio *folio)
-{
-	struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
-
-	return &pgdat->deferred_split_queue;
-}
-#endif
-
 void folio_prep_large_rmappable(struct folio *folio)
 {
 	if (!folio || !folio_test_large(folio))
--- a/mm/internal.h~mm-migrate-split-source-folio-if-it-is-on-deferred-split-list
+++ a/mm/internal.h
@@ -1106,6 +1106,29 @@ struct page *follow_trans_huge_pmd(struc
 				   unsigned long addr, pmd_t *pmd,
 				   unsigned int flags);
 
+#ifdef CONFIG_MEMCG
+static inline
+struct deferred_split *get_deferred_split_queue(struct folio *folio)
+{
+	struct mem_cgroup *memcg = folio_memcg(folio);
+	struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
+
+	if (memcg)
+		return &memcg->deferred_split_queue;
+	else
+		return &pgdat->deferred_split_queue;
+}
+#else
+static inline
+struct deferred_split *get_deferred_split_queue(struct folio *folio)
+{
+	struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
+
+	return &pgdat->deferred_split_queue;
+}
+#endif
+
+
 /*
  * mm/mmap.c
  */
--- a/mm/migrate.c~mm-migrate-split-source-folio-if-it-is-on-deferred-split-list
+++ a/mm/migrate.c
@@ -1654,25 +1654,63 @@ static int migrate_pages_batch(struct li
 
 			/*
 			 * Large folio migration might be unsupported or
-			 * the allocation might be failed so we should retry
-			 * on the same folio with the large folio split
+			 * the folio is on deferred split list so we should
+			 * retry on the same folio with the large folio split
 			 * to normal folios.
 			 *
 			 * Split folios are put in split_folios, and
 			 * we will migrate them after the rest of the
 			 * list is processed.
 			 */
-			if (!thp_migration_supported() && is_thp) {
-				nr_failed++;
-				stats->nr_thp_failed++;
-				if (!try_split_folio(folio, split_folios)) {
-					stats->nr_thp_split++;
-					stats->nr_split++;
+			if (is_thp) {
+				bool is_on_deferred_list = false;
+
+				/*
+				 * Check without taking split_queue_lock to
+				 * reduce locking overheads. The worst case is
+				 * that if the folio is put on the deferred
+				 * split list after the check, it will be
+				 * migrated and not put back on the list.
+				 * The migrated folio will not be split
+				 * via shrinker during memory pressure.
+				 */
+				if (!data_race(list_empty(&folio->_deferred_list))) {
+					struct deferred_split *ds_queue;
+					unsigned long flags;
+
+					ds_queue =
+						get_deferred_split_queue(folio);
+					spin_lock_irqsave(&ds_queue->split_queue_lock,
+							  flags);
+					/*
+					 * Only check if the folio is on
+					 * deferred split list without removing
+					 * it. Since the folio can be on
+					 * deferred_split_scan() local list and
+					 * removing it can cause the local list
+					 * corruption. Folio split process
+					 * below can handle it with the help of
+					 * folio_ref_freeze().
+					 */
+					is_on_deferred_list =
+						!list_empty(&folio->_deferred_list);
+					spin_unlock_irqrestore(&ds_queue->split_queue_lock,
+							       flags);
+				}
+				if (!thp_migration_supported() ||
+						is_on_deferred_list) {
+					nr_failed++;
+					stats->nr_thp_failed++;
+					if (!try_split_folio(folio,
+								split_folios)) {
+						stats->nr_thp_split++;
+						stats->nr_split++;
+						continue;
+					}
+					stats->nr_failed_pages += nr_pages;
+					list_move_tail(&folio->lru, ret_folios);
 					continue;
 				}
-				stats->nr_failed_pages += nr_pages;
-				list_move_tail(&folio->lru, ret_folios);
-				continue;
 			}
 
 			rc = migrate_folio_unmap(get_new_folio, put_new_folio,
_

Patches currently in -mm which might be from ziy@xxxxxxxxxx are

mm-migrate-split-source-folio-if-it-is-on-deferred-split-list.patch
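
Editorial note: the mm/migrate.c hunk above peeks at the folio's list
linkage without the lock, and only takes split_queue_lock to confirm
membership when the peek says the folio might be queued.  It never unlinks
the folio.  The following is a rough userspace sketch of that pattern only,
not kernel code.  All names in it (struct node, struct queue, queue_add,
node_is_queued) are hypothetical stand-ins, and a C11 atomic_flag spinlock
is assumed in place of the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

/* Hypothetical stand-in for struct deferred_split: a lock plus a list. */
struct queue {
	atomic_flag lock;
	struct node head;
};

/* A node that points at itself is not on any list. */
static void node_init(struct node *n) { n->next = n->prev = n; }
static bool node_unlinked(const struct node *n) { return n->next == n; }

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	node_init(&q->head);
}

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Link n right after the list head, under the lock. */
static void queue_add(struct queue *q, struct node *n)
{
	queue_lock(q);
	n->next = q->head.next;
	n->prev = &q->head;
	q->head.next->prev = n;
	q->head.next = n;
	queue_unlock(q);
}

/*
 * Report whether n is queued, without unlinking it.  The first test is
 * done without the lock, so the common "not queued" case pays no locking
 * cost.  The kernel annotates the equivalent read with data_race(); in
 * portable multithreaded C this read would need to be atomic, so treat
 * this sketch as single-threaded.
 */
static bool node_is_queued(struct queue *q, struct node *n)
{
	bool queued = false;

	assert(q != NULL && n != NULL);
	if (!node_unlinked(n)) {
		queue_lock(q);
		queued = !node_unlinked(n);	/* confirm under the lock */
		queue_unlock(q);
	}
	return queued;
}
```

As in the patch, a node that is queued just after the unlocked peek is
reported as not queued; the patch's comment accepts the same window as its
worst case.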