On 4 Apr 2022, at 11:18, Naoya Horiguchi wrote:

> On Mon, Apr 04, 2022 at 10:47:20AM -0400, Zi Yan wrote:
>> On 4 Apr 2022, at 10:29, Matthew Wilcox wrote:
>>
>>> On Mon, Apr 04, 2022 at 10:05:00AM -0400, Zi Yan wrote:
>>>> On 4 Apr 2022, at 9:29, Naoya Horiguchi wrote:
>>>>> I found that the below VM_BUG_ON_FOLIO is triggered on v5.18-rc1
>>>>> (and also reproducible with mmotm on 3/31).
>>>>> I have no idea about the bug's mechanism, but it seems not to be
>>>>> shared in LKML yet, so let me just share. config.gz is attached.
>>>>>
>>>>> [ 48.206424] page:0000000021452e3a refcount:6 mapcount:0 mapping:000000003aaf5253 index:0x0 pfn:0x14e600
>>>>> [ 48.213316] head:0000000021452e3a order:9 compound_mapcount:0 compound_pincount:0
>>>>> [ 48.218830] aops:xfs_address_space_operations [xfs] ino:dee dentry name:"libc.so.6"
>>>>> [ 48.225098] flags: 0x57ffffc0012027(locked|referenced|uptodate|active|private|head|node=1|zone=2|lastcpupid=0x1fffff)
>>>>> [ 48.232792] raw: 0057ffffc0012027 0000000000000000 dead000000000122 ffff8a0dc9a376b8
>>>>> [ 48.238464] raw: 0000000000000000 ffff8a0dc6b23d20 00000006ffffffff 0000000000000000
>>>>> [ 48.244109] page dumped because: VM_BUG_ON_FOLIO(folio_nr_pages(old) != nr_pages)
>>>>> [ 48.249196] ------------[ cut here ]------------
>>>>> [ 48.251240] kernel BUG at mm/memcontrol.c:6857!
>>>>> [ 48.260535] RIP: 0010:mem_cgroup_migrate+0x217/0x320
>>>>> [ 48.286942] Call Trace:
>>>>> [ 48.287665] <TASK>
>>>>> [ 48.288255] iomap_migrate_page+0x64/0x190
>>>>> [ 48.289366] move_to_new_page+0xa3/0x470
>>>>
>>>> Is it because migration code assumes all THPs have order=HPAGE_PMD_ORDER?
>>>> Would the patch below fix the issue?
>
> I briefly confirmed that this bug didn't reproduce with your change,
> thank you very much!
>

Thanks.

Hi Matthew,

I am wondering if my change is the right fix or not. folios with order>0 are
still available when CONFIG_TRANSPARENT_HUGEPAGE is not set, right?
Then, PageTransHuge always returns false and the VM_BUG will still be
triggered, since there is no code to allocate folios with order>0.

Maybe the patch below could cover !CONFIG_TRANSPARENT_HUGEPAGE too?

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a2516d31db6c..6e60b5c4b565 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1209,7 +1209,7 @@ static struct page *new_page(struct page *page, unsigned long start)
 		struct page *thp;

 		thp = alloc_hugepage_vma(GFP_TRANSHUGE, vma, address,
-					 HPAGE_PMD_ORDER);
+					 thp_order(page));
 		if (!thp)
 			return NULL;
 		prep_transhuge_page(thp);
@@ -1218,8 +1218,8 @@ static struct page *new_page(struct page *page, unsigned long start)
 	/*
 	 * if !vma, alloc_page_vma() will use task or system default policy
 	 */
-	return alloc_page_vma(GFP_HIGHUSER_MOVABLE | __GFP_RETRY_MAYFAIL,
-			vma, address);
+	return alloc_pages_vma(GFP_HIGHUSER_MOVABLE | __GFP_RETRY_MAYFAIL,
+			folio_order(page_folio(page)), vma, address);
 }
 #else

diff --git a/mm/migrate.c b/mm/migrate.c
index de175e2fdba5..b079605854d7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1522,7 +1522,7 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 {
 	struct migration_target_control *mtc;
 	gfp_t gfp_mask;
-	unsigned int order = 0;
+	unsigned int order = folio_order(page_folio(page));
 	struct page *new_page = NULL;
 	int nid;
 	int zidx;
@@ -1547,7 +1547,7 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 		 */
 		gfp_mask &= ~__GFP_RECLAIM;
 		gfp_mask |= GFP_TRANSHUGE;
-		order = HPAGE_PMD_ORDER;
+		order = thp_order(page);
 	}
 	zidx = zone_idx(page_zone(page));
 	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)

>>>
>>> This looks entirely plausible to me! I do have changes in this area,
>>> but clearly I should have submitted them earlier. Let's get these fixes
>>> in as they are.
>>>
>>> Is there a test suite that tests page migration? I usually use xfstests
>>> and it does no page migration at all (at least 'git grep migrate'
>>> finds nothing useful).
>>
>> https://github.com/linux-test-project/ltp has some migrate_pages and move_pages
>> tests. You can run them after install ltp:
>> sudo ./runltp -f syscalls -s migrate_pages and
>> sudo ./runltp -f syscalls -s move_pages

--
Best Regards,
Yan, Zi
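
To make the question above concrete, here is a minimal user-space sketch of the suspected mechanism. It is only a model, not kernel code: struct model_folio, MODEL_PMD_ORDER and all three helper functions are hypothetical names invented for illustration. It assumes the old allocation path picks the PMD order (9 on x86_64 with 4K pages, matching the order:9 in the dump) for any source folio with order>0, while the patched path copies the source order.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio: only its order is modeled. */
struct model_folio {
	unsigned int order;	/* the folio spans 1 << order base pages */
};

/* Assumed PMD order for x86_64 with 4K base pages. */
#define MODEL_PMD_ORDER 9

/* Number of base pages covered by the modeled folio. */
static unsigned long model_nr_pages(const struct model_folio *f)
{
	/* Guard against shifting by the full width of unsigned long. */
	assert(f->order < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << f->order;
}

/* Modeled old behaviour: any large folio gets a PMD-order target. */
static struct model_folio alloc_target_old(const struct model_folio *src)
{
	struct model_folio dst = {
		.order = src->order > 0 ? MODEL_PMD_ORDER : 0,
	};
	return dst;
}

/* Modeled patched behaviour: the target always has the source's order. */
static struct model_folio alloc_target_new(const struct model_folio *src)
{
	struct model_folio dst = { .order = src->order };
	return dst;
}

/* Models the size check: true when old and new cover the same page count. */
static bool migrate_sizes_match(const struct model_folio *old_f,
				const struct model_folio *new_f)
{
	return model_nr_pages(old_f) == model_nr_pages(new_f);
}
```

In this model, an order-2 source folio gets an order-9 target from alloc_target_old(), so migrate_sizes_match() returns false, which corresponds to the condition the VM_BUG_ON_FOLIO reports. With alloc_target_new() the sizes always match, for order 0, intermediate orders, and the PMD order alike.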