On 2023/9/5 17:06, Muchun Song wrote:
On Sep 5, 2023, at 11:13, Yuan Can <yuancan@xxxxxxxxxx> wrote:
Decreasing the number of hugetlb pages failed, with the following message
given:
sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace.part.6+0x84/0xe4
show_stack+0x18/0x24
dump_stack_lvl+0x48/0x60
dump_stack+0x18/0x24
warn_alloc+0x100/0x1bc
__alloc_pages_slowpath.constprop.107+0xa40/0xad8
__alloc_pages+0x244/0x2d0
hugetlb_vmemmap_restore+0x104/0x1e4
__update_and_free_hugetlb_folio+0x44/0x1f4
update_and_free_hugetlb_folio+0x20/0x68
update_and_free_pages_bulk+0x4c/0xac
set_max_huge_pages+0x198/0x334
nr_hugepages_store_common+0x118/0x178
nr_hugepages_store+0x18/0x24
kobj_attr_store+0x18/0x2c
sysfs_kf_write+0x40/0x54
kernfs_fop_write_iter+0x164/0x1dc
vfs_write+0x3a8/0x460
ksys_write+0x6c/0x100
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.1+0x6c/0xe4
do_el0_svc+0x38/0x94
el0_svc+0x28/0x74
el0t_64_sync_handler+0xa0/0xc4
el0t_64_sync+0x174/0x178
Mem-Info:
...
The reason is that the hugetlb pages being released are allocated from
movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
need to be allocated from the same node during the hugetlb pages
releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
movable node always fails. Fix this problem by removing __GFP_THISNODE.

Thanks for your fix. I think it is a real-world issue, so it's better
to add a Fixes tag to indicate backporting. It should be ad2fa3717b74
("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB
page"). Thanks.
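As a hedged illustration of the failure mode described in the commit
message (the helper below is hypothetical, only for illustration, and
not part of the patch): GFP_KERNEL does not carry __GFP_MOVABLE, so the
allocator never considers ZONE_MOVABLE, and __GFP_THISNODE forbids
leaving nid's own zonelist, so on a node whose memory is entirely in
ZONE_MOVABLE there is nothing left to try:

/* Hypothetical demo of the pre-patch mask, for illustration only. */
static struct page *alloc_vmemmap_demo_old(int nid)
{
	/*
	 * GFP_KERNEL caps the zone walk at ZONE_NORMAL, and
	 * __GFP_THISNODE pins it to nid's own zonelist; on a
	 * movable-only node this returns NULL every time, which is
	 * the warn_alloc() splat quoted above.
	 */
	gfp_t old_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;

	return alloc_pages_node(nid, old_mask, 0);
}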
Signed-off-by: Yuan Can <yuancan@xxxxxxxxxx>
---
mm/hugetlb_vmemmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c2007ef5e9b0..0485e471d224 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
struct list_head *list)
{
- gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
+ gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
There is a little change for the non-movable case after this change: we first
try to allocate memory from the preferred node (the same as before), and if
that fails, it now falls back to other nodes. For me, it makes sense. At
least, those huge pages can be freed once other nodes can satisfy the
allocation of the vmemmap pages (see the sketch after the quoted patch).
Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Thanks.
unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
int nid = page_to_nid((struct page *)start);
struct page *page, *next;
--
2.17.1
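To make the new behaviour concrete, here is a minimal sketch of
alloc_vmemmap_page_list() as it would look after this patch. The loop
body below is reconstructed from the quoted context and the description
above, so the exact code in mm/hugetlb_vmemmap.c may differ:

static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
				   struct list_head *list)
{
	/* No __GFP_THISNODE: nid below is a preference, not a requirement. */
	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
	int nid = page_to_nid((struct page *)start);
	struct page *page, *next;

	while (nr_pages--) {
		/*
		 * Prefer nid, but let the zonelist walk fall back to
		 * other nodes on failure; GFP_KERNEL still excludes
		 * ZONE_MOVABLE, so the vmemmap pages never land in a
		 * movable zone.
		 */
		page = alloc_pages_node(nid, gfp_mask, 0);
		if (!page)
			goto out;
		list_add_tail(&page->lru, list);
	}

	return 0;
out:
	/* Undo partial progress so the caller sees all-or-nothing. */
	list_for_each_entry_safe(page, next, list, lru)
		__free_pages(page, 0);
	return -ENOMEM;
}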