Hi Oscar, > > On Tue, Jun 11, 2024 at 07:52:06PM +0200, Oscar Salvador wrote: > > On Tue, Jun 11, 2024 at 07:46:33PM +0200, Oscar Salvador wrote: > > > On Tue, Jun 11, 2024 at 10:30:05AM -0700, Andrew Morton wrote: > > > > On Tue, 11 Jun 2024 03:34:25 -0700 syzbot > <syzbot+569ed13f4054f271087b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > Hello, > > > > > > > > > > syzbot found the following issue on: > > > > > > > > Thanks. > > > > > > > > > Call Trace: > > > > > <TASK> > > > > > alloc_hugetlb_folio_nodemask+0xae/0x3f0 mm/hugetlb.c:2603 > > > > > memfd_alloc_folio+0x15e/0x390 mm/memfd.c:75 > > > > > memfd_pin_folios+0x1066/0x1720 mm/gup.c:3864 > > > > > udmabuf_create+0x658/0x11c0 drivers/dma-buf/udmabuf.c:353 > > > > > udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:420 [inline] > > > > > udmabuf_ioctl+0x304/0x4f0 drivers/dma-buf/udmabuf.c:451 > > > > > vfs_ioctl fs/ioctl.c:51 [inline] > > > > > __do_sys_ioctl fs/ioctl.c:907 [inline] > > > > > __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893 > > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 > > > > > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > > > > > > > I think we can pretty confidently point at the series "mm/gup: > > > > Introduce memfd_pin_folios() for pinning memfd folios". I'll drop the > > > > v14 series. > > > > > > jfyi: I am trying to reproduce this locally. > > > > Actually, should not memfd_alloc_folio() pass htlb_alloc_mask() instead > > of GFP_USER to alloc_hugetlb_folio_nodemask? Or at least do > > GFP_HIGHUSER. > > Ok, I spot the issue. > memfd_alloc_folio() was calling alloc_hugetlb_folio_nodemask with > preferred_nid being NUMA_NO_NODE, but that is bad as > dequeue_hugetlb_folio_nodemask will do: > > zonelist = node_zonelist(nid, gfp_mask) > > which will try to get node_zonelists from nid, but since nid is -1, heh. > > The below patch fixes the issue for me, but I think that the right place > to fix this up would be alloc_hugetlb_folio_nodemask(), so we can place > the numa_node_id() if preferred_nid = NUMA_NO_NODE in there as a safety > net. > This way we catch this before exploding in case the user was not careful > enough. > > I will cook up a patch shortly. Thank you for fixing this issue! > > Another thing is why memfd_alloc_folio uses GFP_USER instead of > GFP_HIGHUSER, but that maybe because I see that memfd_pin_folios() is > used by some DMA driver which might not have access to HIGH_MEMORY. Right, memfd_pin_folios() is used by udmabuf driver for DMA but the reason why GFP_USER is chosen is because I was following this code in gup.c: struct migration_target_control mtc = { .nid = NUMA_NO_NODE, .gfp_mask = GFP_USER | __GFP_NOWARN, .reason = MR_LONGTERM_PIN, }; if (migrate_pages(movable_folio_list, alloc_migration_target, NULL, (unsigned long)&mtc, MIGRATE_SYNC, MR_LONGTERM_PIN, NULL)) { where, alloc_migration_target() does the following to allocate a hugetlb folio: if (folio_test_hugetlb(src)) { struct hstate *h = folio_hstate(src); gfp_mask = htlb_modify_alloc_mask(h, gfp_mask); return alloc_hugetlb_folio_nodemask(h, nid, mtc->nmask, gfp_mask, htlb_allow_alloc_fallback(mtc->reason)); but I somehow missed the early check in alloc_migration_target() where it does: if (nid == NUMA_NO_NODE) nid = folio_nid(src); Thanks, Vivek > > diff --git a/mm/memfd.c b/mm/memfd.c > index 8035c6325e3c..2692f0298adc 100644 > --- a/mm/memfd.c > +++ b/mm/memfd.c > @@ -68,12 +68,13 @@ static void memfd_tag_pins(struct xa_state *xas) > struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) > { > #ifdef CONFIG_HUGETLB_PAGE > + int nid = numa_node_id(); > struct folio *folio; > int err; > > if (is_file_hugepages(memfd)) { > folio = alloc_hugetlb_folio_nodemask(hstate_file(memfd), > - NUMA_NO_NODE, > + nid, > NULL, > GFP_USER, > false); > > > > > > > -- > > Oscar Salvador > > SUSE Labs > > > > -- > Oscar Salvador > SUSE Labs