On 2023/9/5 17:06, Muchun Song wrote:
On Sep 5, 2023, at 11:13, Yuan Can <yuancan@xxxxxxxxxx> wrote:
Decreasing the number of hugetlb pages failed, with the following message
given:
sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace.part.6+0x84/0xe4
show_stack+0x18/0x24
dump_stack_lvl+0x48/0x60
dump_stack+0x18/0x24
warn_alloc+0x100/0x1bc
__alloc_pages_slowpath.constprop.107+0xa40/0xad8
__alloc_pages+0x244/0x2d0
hugetlb_vmemmap_restore+0x104/0x1e4
__update_and_free_hugetlb_folio+0x44/0x1f4
update_and_free_hugetlb_folio+0x20/0x68
update_and_free_pages_bulk+0x4c/0xac
set_max_huge_pages+0x198/0x334
nr_hugepages_store_common+0x118/0x178
nr_hugepages_store+0x18/0x24
kobj_attr_store+0x18/0x2c
sysfs_kf_write+0x40/0x54
kernfs_fop_write_iter+0x164/0x1dc
vfs_write+0x3a8/0x460
ksys_write+0x6c/0x100
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.1+0x6c/0xe4
do_el0_svc+0x38/0x94
el0_svc+0x28/0x74
el0t_64_sync_handler+0xa0/0xc4
el0t_64_sync+0x174/0x178
Mem-Info:
...
The reason is that the hugetlb pages being released are allocated from
movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
need to be allocated from the same node during the hugetlb pages
releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
movable node always fails. Fix this problem by removing __GFP_THISNODE.

Thanks for your fix. I think it is a real-world issue, so it's better
to add a Fixes tag to indicate backporting. It should be ad2fa3717b74
("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB
page"). Thanks.
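As a hedged illustration of the failure mode described in the commit
message (the helper below is hypothetical, only for illustration, and
not part of the patch): GFP_KERNEL does not carry __GFP_MOVABLE, so the
allocator never considers ZONE_MOVABLE, and __GFP_THISNODE forbids
leaving nid's own zonelist, so on a node whose memory is entirely in
ZONE_MOVABLE there is nothing left to try:

/* Hypothetical demo of the pre-patch mask, for illustration only. */
static struct page *alloc_vmemmap_demo_old(int nid)
{
	/*
	 * GFP_KERNEL caps the zone walk at ZONE_NORMAL, and
	 * __GFP_THISNODE pins it to nid's own zonelist; on a
	 * movable-only node this returns NULL every time, which is
	 * the warn_alloc() splat quoted above.
	 */
	gfp_t old_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;

	return alloc_pages_node(nid, old_mask, 0);
}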
Signed-off-by: Yuan Can <yuancan@xxxxxxxxxx>
---
mm/hugetlb_vmemmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c2007ef5e9b0..0485e471d224 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
struct list_head *list)
{
- gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
+ gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
There is a little change for the non-movable case after this change: we first
try to allocate memory from the preferred node (the same as before), and if
that fails, it now falls back to other nodes. For me, it makes sense. At
least, those huge pages can be freed once other nodes can satisfy the
allocation of the vmemmap pages (see the sketch after the quoted patch).
Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Thanks.
unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
int nid = page_to_nid((struct page *)start);
struct page *page, *next;
--
2.17.1
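To make the new behaviour concrete, here is a minimal sketch of
alloc_vmemmap_page_list() as it would look after this patch. The loop
body below is reconstructed from the quoted context and the description
above, so the exact code in mm/hugetlb_vmemmap.c may differ:

static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
				   struct list_head *list)
{
	/* No __GFP_THISNODE: nid below is a preference, not a requirement. */
	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
	int nid = page_to_nid((struct page *)start);
	struct page *page, *next;

	while (nr_pages--) {
		/*
		 * Prefer nid, but let the zonelist walk fall back to
		 * other nodes on failure; GFP_KERNEL still excludes
		 * ZONE_MOVABLE, so the vmemmap pages never land in a
		 * movable zone.
		 */
		page = alloc_pages_node(nid, gfp_mask, 0);
		if (!page)
			goto out;
		list_add_tail(&page->lru, list);
	}

	return 0;
out:
	/* Undo partial progress so the caller sees all-or-nothing. */
	list_for_each_entry_safe(page, next, list, lru)
		__free_pages(page, 0);
	return -ENOMEM;
}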