> On Sep 6, 2023, at 08:28, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> 
> On 09/05/23 17:06, Muchun Song wrote:
>> 
>> 
>>> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@xxxxxxxxxx> wrote:
>>> 
>>> Decreasing the number of hugetlb pages failed, with the following message
>>> given:
>>> 
>>>  sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
>>>  CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
>>>  Hardware name: linux,dummy-virt (DT)
>>>  Call trace:
>>>   dump_backtrace.part.6+0x84/0xe4
>>>   show_stack+0x18/0x24
>>>   dump_stack_lvl+0x48/0x60
>>>   dump_stack+0x18/0x24
>>>   warn_alloc+0x100/0x1bc
>>>   __alloc_pages_slowpath.constprop.107+0xa40/0xad8
>>>   __alloc_pages+0x244/0x2d0
>>>   hugetlb_vmemmap_restore+0x104/0x1e4
>>>   __update_and_free_hugetlb_folio+0x44/0x1f4
>>>   update_and_free_hugetlb_folio+0x20/0x68
>>>   update_and_free_pages_bulk+0x4c/0xac
>>>   set_max_huge_pages+0x198/0x334
>>>   nr_hugepages_store_common+0x118/0x178
>>>   nr_hugepages_store+0x18/0x24
>>>   kobj_attr_store+0x18/0x2c
>>>   sysfs_kf_write+0x40/0x54
>>>   kernfs_fop_write_iter+0x164/0x1dc
>>>   vfs_write+0x3a8/0x460
>>>   ksys_write+0x6c/0x100
>>>   __arm64_sys_write+0x1c/0x28
>>>   invoke_syscall+0x44/0x100
>>>   el0_svc_common.constprop.1+0x6c/0xe4
>>>   do_el0_svc+0x38/0x94
>>>   el0_svc+0x28/0x74
>>>   el0t_64_sync_handler+0xa0/0xc4
>>>   el0t_64_sync+0x174/0x178
>>>  Mem-Info:
>>>  ...
>>> 
>>> The reason is that the hugetlb pages being released are allocated from
>>> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
>>> need to be allocated from the same node while the hugetlb pages are being
>> 
>> Thanks for your fix, I think it is a real world issue, so it's better
>> to add a Fixes tag to indicate backporting. Thanks.
>> 
> 
> I thought we might get the same error (unable to allocate on a movable
> node) when creating the hugetlb page. Why? Because we replace the head
> vmemmap page. However, I see that a failure to allocate there is not a
> fatal error and we fall back to the currently mapped page. We also pass
> __GFP_NOWARN to that allocation attempt, so there will be no report of
> the failure.
> 
> We might want to change this as well?

I think yes. I also thought about this yesterday, but since that one is
not a fatal error, it should be handled as a separate improvement patch.
So it is better not to fold that change into this patch (a bug fix).
Thanks.

> 
>>> released. With GFP_KERNEL and __GFP_THISNODE set, allocating from a movable
>>> node always fails. Fix this problem by removing __GFP_THISNODE.
>>> 
>>> Signed-off-by: Yuan Can <yuancan@xxxxxxxxxx>
>>> ---
>>>  mm/hugetlb_vmemmap.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index c2007ef5e9b0..0485e471d224 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>>>  static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>>>  				   struct list_head *list)
>>>  {
>>> -	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>>> +	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>> 
>> There is a small behavior change for the non-movable case: we first try
>> to allocate memory from the preferred node (same as before), but if that
>> fails, we now fall back to other nodes. For me, it makes sense. At least,
>> those huge pages can be freed once other nodes can satisfy the allocation
>> of vmemmap pages.
>> 
>> Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> 
> This looks reasonable to me as well.
> 
> Cc'ing David and Michal as they are experts in hotplug.
> -- 
> Mike Kravetz
> 
>> 
>> Thanks.
>> 
>>>  	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>>>  	int nid = page_to_nid((struct page *)start);
>>>  	struct page *page, *next;
>>> -- 
>>> 2.17.1
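
For reference, the fallback behavior discussed above comes from the
allocation loop that surrounds the quoted hunk in alloc_vmemmap_page_list().
Below is a rough sketch of that loop, reconstructed here only to illustrate
the point (the loop body is not part of the quoted context, and the usual
kernel headers such as <linux/gfp.h> and <linux/list.h> are assumed).
Without __GFP_THISNODE, alloc_pages_node() treats the nid derived from the
vmemmap range only as the preferred node, so the zonelist fallback may
satisfy the request from other nodes:

static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
				   struct list_head *list)
{
	/* Without __GFP_THISNODE, nid below is a preference, not a hard requirement. */
	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
	int nid = page_to_nid((struct page *)start);
	struct page *page, *next;

	while (nr_pages--) {
		/* Try the preferred node first; other nodes may serve the request. */
		page = alloc_pages_node(nid, gfp_mask, 0);
		if (!page)
			goto out;
		list_add_tail(&page->lru, list);
	}

	return 0;
out:
	/* Roll back pages already collected if any single allocation fails. */
	list_for_each_entry_safe(page, next, list, lru)
		__free_pages(page, 0);
	return -ENOMEM;
}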