On 09/05/23 17:06, Muchun Song wrote: > > > > On Sep 5, 2023, at 11:13, Yuan Can <yuancan@xxxxxxxxxx> wrote: > > > > The decreasing of hugetlb pages number failed with the following message > > given: > > > > sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE) > > CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45 > > Hardware name: linux,dummy-virt (DT) > > Call trace: > > dump_backtrace.part.6+0x84/0xe4 > > show_stack+0x18/0x24 > > dump_stack_lvl+0x48/0x60 > > dump_stack+0x18/0x24 > > warn_alloc+0x100/0x1bc > > __alloc_pages_slowpath.constprop.107+0xa40/0xad8 > > __alloc_pages+0x244/0x2d0 > > hugetlb_vmemmap_restore+0x104/0x1e4 > > __update_and_free_hugetlb_folio+0x44/0x1f4 > > update_and_free_hugetlb_folio+0x20/0x68 > > update_and_free_pages_bulk+0x4c/0xac > > set_max_huge_pages+0x198/0x334 > > nr_hugepages_store_common+0x118/0x178 > > nr_hugepages_store+0x18/0x24 > > kobj_attr_store+0x18/0x2c > > sysfs_kf_write+0x40/0x54 > > kernfs_fop_write_iter+0x164/0x1dc > > vfs_write+0x3a8/0x460 > > ksys_write+0x6c/0x100 > > __arm64_sys_write+0x1c/0x28 > > invoke_syscall+0x44/0x100 > > el0_svc_common.constprop.1+0x6c/0xe4 > > do_el0_svc+0x38/0x94 > > el0_svc+0x28/0x74 > > el0t_64_sync_handler+0xa0/0xc4 > > el0t_64_sync+0x174/0x178 > > Mem-Info: > > ... > > > > The reason is that the hugetlb pages being released are allocated from > > movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages > > need to be allocated from the same node during the hugetlb pages > > Thanks for your fix, I think it should be a real word issue, it's better > to add a Fixes tag to indicate backporting. Thanks. > I thought we might get get the same error (Unable to allocate on movable node) when creating the hugetlb page. Why? Because we replace the head vmemmap page. However, I see that failure to allocate there is not a fatal error and we fallback to the currently mapped page. We also pass __GFP_NOWARN to that allocation attempt so there will be no report of the failure. We might want to change this as well? > > releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from movable > > node is always failed. Fix this problem by removing __GFP_THISNODE. > > > > Signed-off-by: Yuan Can <yuancan@xxxxxxxxxx> > > --- > > mm/hugetlb_vmemmap.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c > > index c2007ef5e9b0..0485e471d224 100644 > > --- a/mm/hugetlb_vmemmap.c > > +++ b/mm/hugetlb_vmemmap.c > > @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end, > > static int alloc_vmemmap_page_list(unsigned long start, unsigned long end, > > struct list_head *list) > > { > > - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE; > > + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL; > > There is a little change for non-movable case after this change, we fist try > to allocate memory from the preferred node (it is same as original), if it > fails, it fallbacks to other nodes now. For me, it makes sense. At least, those > huge pages could be freed once other nodes could satisfy the allocation of > vmemmap pages. > > Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx> This looks reasonable to me as well. Cc'ing David and Michal as they are expert in hotplug. -- Mike Kravetz > > Thanks. > > > unsigned long nr_pages = (end - start) >> PAGE_SHIFT; > > int nid = page_to_nid((struct page *)start); > > struct page *page, *next; > > -- > > 2.17.1