Thanks for your response.

-----Original Message-----
From: Muchun Song <muchun.song@xxxxxxxxx>
Sent: Monday, July 3, 2023 4:51 PM
To: Fumin Gao <Fumin.Gao@xxxxxxxxxxxxxxxxxx>
Cc: mike.kravetz@xxxxxxxxxx; Muchun Song <songmuchun@xxxxxxxxxxxxx>; linux-mm@xxxxxxxxx
Subject: Re: Report a huge page issue in kernel version v5.19.xx

> On Jul 3, 2023, at 11:07, Fumin Gao <Fumin.Gao@xxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> What's the issue?
> Recently, in our product, I found an issue in kernel version v5.19.xx; it is fixed in kernel version v6.xx.
> The issue is that the move_pages system call cannot report which node a huge page is on.
>
> How to reproduce this issue?
> I attached my test program to this email.
>
>     virtaddr = mmap(NULL, ONE_GIG, PROT_READ | PROT_WRITE,
>                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
>     *(char *)virtaddr = 0;
>     if (syscall(SYS_move_pages, 0, 1, &virtaddr, NULL, &NumaNode, 0) != 0) {
>         printf("Get virtual address %p on NumaNode failed\n", virtaddr);
>     }
>     printf("create shared memory with mmap, virtaddr %p on Node %d, errno %d\n",
>            virtaddr, NumaNode, errno);
>
> When tested with kernel v5.19.xx, the value of NumaNode is -2 (-ENOENT).
>
> My analysis of this issue.
> Based on the following trace and the kernel source code, I can see the call chain:
>
>     kernel_move_pages -> do_pages_stat -> do_pages_stat_array -> follow_page
>     -> follow_page_mask -> follow_p4d_mask -> follow_pud_mask -> follow_huge_pud
>
> [001] ..... 510329749178328: sys_move_pages(pid: 0, nr_pages: 1, pages: 7fffa23a2c90, nodes: 0, status: 7fffa23a2c9c, flags: 0)
> [001] ..... 510329749179360: sys_enter: NR 279 (0, 1, 7fffa23a2c90, 0, 7fffa23a2c9c, 0)
> [001] ...1. 510329749185448: mmap_lock_start_locking: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false
> [001] ...1. 510329749187872: mmap_lock_acquire_returned: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false success=true
> [001] ..... 510329749196628: p_follow_page_0: (follow_page+0x0/0xe0)
> [001] ..... 510329749199690: p_vma_is_secretmem_0: (vma_is_secretmem+0x0/0x20)
> [001] ..... 510329749202194: p_follow_page_mask_0: (follow_page_mask+0x0/0x160)
> [001] ..... 510329749206928: p_follow_huge_addr_0: (follow_huge_addr+0x0/0x20)
> [001] ..... 510329749210628: myretprobe: (follow_page_mask+0x38/0x160 <- follow_huge_addr) ret=0xffffffffffffffea
> [001] ..... 510329749216464: p_follow_pud_mask_isra_0_0: (follow_pud_mask.isra.0+0x0/0x1e0)
> [001] ..... 510329749221108: p_follow_huge_pud_0: (follow_huge_pud+0x0/0x80)
> [001] ..... 510329749221902: myretprobe: (follow_pud_mask.isra.0+0x1c8/0x1e0 <- follow_huge_pud) ret=0x0
> [001] ..... 510329749223462: myretprobe: (follow_page_mask+0x147/0x160 <- follow_pud_mask.isra.0) ret=0x0
> [001] ..... 510329749224838: myretprobe: (do_pages_stat+0x18b/0x330 <- follow_page) ret=0x0
> [001] ...1. 510329749226096: mmap_lock_released: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false
> [001] ..... 510329749228348: sys_move_pages -> 0x0
> [001] ..... 510329749229224: sys_exit: NR 279 = 0
>
> In kernel version v5.19.xx, do_pages_stat_array passes an additional flag, FOLL_GET, compared with v5.18.xx:
>
>     page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>
> But follow_huge_pud returns NULL if the flags include FOLL_GET. As a result, move_pages reports a status of -ENOENT (-2) for the page.
>
> Is my analysis correct?

Correct! If you want v5.19 to work properly, you could apply commit 831568214883 ("mm: migration: fix the FOLL_GET failure on following huge page") to fix the issue.

Thanks.
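
For anyone checking whether a given kernel is affected, here is a minimal sketch of the query side of the reproducer. It is not the attached test program. The names query_page_node, classify_status and enum page_state are made up for illustration. The sketch assumes Linux with SYS_move_pages available, and the meanings attached to -ENOENT and -EFAULT follow the move_pages(2) man page. With nodes == NULL, move_pages only reports where each page currently is: the syscall itself can return 0 while the per-page status is a negative errno, which is what the trace above shows on v5.19.xx.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical labels for the per-page status that move_pages() reports. */
enum page_state {
    PAGE_ON_NODE,          /* status >= 0: the value is the NUMA node id */
    PAGE_NOT_PRESENT,      /* -ENOENT: what v5.19.xx reports for the huge page */
    PAGE_UNMAPPED_OR_ZERO, /* -EFAULT: zero page or unmapped area */
    PAGE_OTHER_ERROR       /* any other negative errno */
};

/*
 * Ask the kernel which node the page containing addr is on.
 * Passing nodes == NULL makes move_pages() a pure query, no migration.
 * Returns the raw syscall result (0 on success, -1 with errno set);
 * the per-page answer is stored in *status.
 */
static long query_page_node(void *addr, int *status)
{
    void *pages[1] = { addr };

    assert(status != NULL);
    *status = -EINVAL; /* placeholder in case the syscall fails early */
    return syscall(SYS_move_pages, 0, 1UL, pages, NULL, status, 0);
}

/* Map a per-page status value to one of the labels above. */
static enum page_state classify_status(int status)
{
    if (status >= 0)
        return PAGE_ON_NODE;
    if (status == -ENOENT)
        return PAGE_NOT_PRESENT;
    if (status == -EFAULT)
        return PAGE_UNMAPPED_OR_ZERO;
    return PAGE_OTHER_ERROR;
}
```

To use it, touch the hugetlb mapping first, call query_page_node(virtaddr, &status), and pass status to classify_status. If a huge page that has already been faulted in comes back as PAGE_NOT_PRESENT, that matches the behaviour described in this thread.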