There are cases when we try to pin a folio but discover that it has not been faulted-in. So, we try to allocate it in memfd_alloc_folio() but there is a chance that we might encounter a crash/failure (VM_BUG_ON(!h->resv_huge_pages)) if there are no active reservations at that instant. This issue was reported by syzbot: kernel BUG at mm/hugetlb.c:2403! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5315 Comm: syz.0.0 Not tainted 6.13.0-rc5-syzkaller-00161-g63676eefb7a0 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:alloc_hugetlb_folio_reserve+0xbc/0xc0 mm/hugetlb.c:2403 Code: 1f eb 05 e8 56 18 a0 ff 48 c7 c7 40 56 61 8e e8 ba 21 cc 09 4c 89 f0 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc e8 35 18 a0 ff 90 <0f> 0b 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f RSP: 0018:ffffc9000d3d77f8 EFLAGS: 00010087 RAX: ffffffff81ff6beb RBX: 0000000000000000 RCX: 0000000000100000 RDX: ffffc9000e51a000 RSI: 00000000000003ec RDI: 00000000000003ed RBP: 1ffffffff34810d9 R08: ffffffff81ff6ba3 R09: 1ffffd4000093005 R10: dffffc0000000000 R11: fffff94000093006 R12: dffffc0000000000 R13: dffffc0000000000 R14: ffffea0000498000 R15: ffffffff9a4086c8 FS: 00007f77ac12e6c0(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f77ab54b170 CR3: 0000000040b70000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> memfd_alloc_folio+0x1bd/0x370 mm/memfd.c:88 memfd_pin_folios+0xf10/0x1570 mm/gup.c:3750 udmabuf_pin_folios drivers/dma-buf/udmabuf.c:346 [inline] udmabuf_create+0x70e/0x10c0 drivers/dma-buf/udmabuf.c:443 udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:495 [inline] udmabuf_ioctl+0x301/0x4e0 drivers/dma-buf/udmabuf.c:526 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Therefore, to avoid this situation and fix this issue, we just need to make a reservation before we try to allocate the folio. While at it, also remove the VM_BUG_ON() as there is no need to crash the system in this scenario and instead we could just fail the allocation. Fixes: 26a8ea80929c ("mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak") Reported-by: syzbot+a504cb5bae4fe117ba94@xxxxxxxxxxxxxxxxxxxxxxxxx Signed-off-by: Vivek Kasireddy <vivek.kasireddy@xxxxxxxxx> Cc: Steve Sistare <steven.sistare@xxxxxxxxxx> Cc: Muchun Song <muchun.song@xxxxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/hugetlb.c | 9 ++++++--- mm/memfd.c | 5 +++++ 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index c498874a7170..e46c461210a4 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2397,12 +2397,15 @@ struct folio *alloc_hugetlb_folio_reserve(struct hstate *h, int preferred_nid, struct folio *folio; spin_lock_irq(&hugetlb_lock); + if (!h->resv_huge_pages) { + spin_unlock_irq(&hugetlb_lock); + return NULL; + } + folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, preferred_nid, nmask); - if (folio) { - VM_BUG_ON(!h->resv_huge_pages); + if (folio) h->resv_huge_pages--; - } spin_unlock_irq(&hugetlb_lock); return folio; diff --git a/mm/memfd.c b/mm/memfd.c index 35a370d75c9a..a3012c444285 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -85,6 +85,10 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE); idx >>= huge_page_order(h); + if (!hugetlb_reserve_pages(file_inode(memfd), + idx, idx + 1, NULL, 0)) + return ERR_PTR(-ENOMEM); + folio = alloc_hugetlb_folio_reserve(h, numa_node_id(), NULL, @@ -100,6 +104,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) folio_unlock(folio); return folio; } + hugetlb_unreserve_pages(file_inode(memfd), idx, idx + 1, 1); return ERR_PTR(-ENOMEM); } #endif -- 2.47.1