On Mon, Dec 18, 2023 at 10:46:22PM +0100, Guillaume Morin wrote:
> Hello Chuck,
>
> I believe commit 5f7fc5d "SUNRPC: Resupply rq_pages from node-local memory" in
> Linux 6.5+ is incorrect. It unconditionally passes rq_pool->sp_id as the NUMA
> node.
>
> While the comment in the svc_pool declaration in sunrpc/svc.h says that
> sp_id is also the NUMA node id, it might not be the case if the svc is
> created using svc_create_pooled(). svc_create_pooled() can use the
> per-cpu pool mode, so in this case sp_id would be the cpu id.
>
> from __svc_create:
>     for (i = 0; i < serv->sv_nrpools; i++) {
>         struct svc_pool *pool = &serv->sv_pools[i];
>
>         dprintk("svc: initialising pool %u for %s\n",
>                 i, serv->sv_name);
>
>         pool->sp_id = i;
>
> When using the cpu-mode, this triggers a BUG on my machine:
> BUG: unable to handle page fault for address: 0000000000002088
>
>  #7 [ffffafa3dc42fc90] asm_exc_page_fault at ffffffffa3e00bc7
>     [exception RIP: __next_zones_zonelist+9]
>     RIP: ffffffffa32fbbc9 RSP: ffffafa3dc42fd48 RFLAGS: 00010286
>     RAX: 0000000000002080 RBX: 0000000000000000 RCX: ffff8ba5f22bafc0
>     RDX: ffff8ba5f22bafc0 RSI: 0000000000000002 RDI: 0000000000002080
>     RBP: ffffafa3dc42fdc0 R8: 0000000000002080 R9: ffff8ba62138c2d8
>     R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000cc0
>     R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000001
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>  #8 [ffffafa3dc42fd50] __alloc_pages at ffffffffa334c122
>  #9 [ffffafa3dc42fdc8] __alloc_pages_bulk at ffffffffa334c519
> #10 [ffffafa3dc42fe58] svc_alloc_arg at ffffffffc0afc0d7 [sunrpc]
> #11 [ffffafa3dc42fea0] svc_recv at ffffffffc0afe08d [sunrpc]
> #12 [ffffafa3dc42fec8] nfsd at ffffffffc0dec469 [nfsd]
> #13 [ffffafa3dc42fee8] kthread at ffffffffa30e4826
>
> I believe the fix is to expose svc_pool_map_get_node() and use that in
> the alloc_pages_bulk_array_node() call in svc_xprt.c. Reverting 5f7fc5d
> would obviously work as well.
>
> The comment in svc.h should probably be updated as well since it's misleading.
>
> I didn't provide a patch because I wasn't quite sure which approach you would
> prefer but could provide one if that's helpful.

Reverted and applied for v6.7-rc (see my nfsd-fixes branch). Thanks for the
report and analysis!

--
Chuck Lever
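
For readers following the thread, the mismatch described in the report can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: the types, the helper names (pool_node, node_is_valid) and the 4-CPU, 2-node topology are all hypothetical, and it does not reproduce the behaviour of the kernel's svc_pool_map_get_node(). It only illustrates why a pool id must be translated through the pool map before being used as a NUMA node in per-cpu mode.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: every name below is hypothetical. */
enum pool_mode { POOL_PERCPU, POOL_PERNODE };

#define NR_CPUS  4
#define NR_NODES 2

/* Assumed topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const int cpu_to_node_tbl[NR_CPUS] = { 0, 0, 1, 1 };

struct pool_map {
	enum pool_mode mode;
	int npools;
	/* pool id -> cpu id (per-cpu mode) or node id (per-node mode) */
	int pool_to[NR_CPUS];
};

/* A node id is usable only if it names a node that exists in the model. */
static int node_is_valid(int node)
{
	return node >= 0 && node < NR_NODES;
}

/*
 * Translate a pool id into a NUMA node id, honouring the pool mode.
 * Returns -1 for an out-of-range pool id.
 */
static int pool_node(const struct pool_map *m, int pool_id)
{
	assert(m != NULL);
	if (pool_id < 0 || pool_id >= m->npools)
		return -1;
	if (m->mode == POOL_PERCPU)
		return cpu_to_node_tbl[m->pool_to[pool_id]];
	return m->pool_to[pool_id];
}
```

With a per-cpu map of four pools, `{ POOL_PERCPU, NR_CPUS, { 0, 1, 2, 3 } }`, pool id 3 is not a valid node in this two-node model, whereas pool_node() maps it to node 1. Passing the raw id straight to a node-aware allocator is the pattern the report identifies as the cause of the fault.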