Re: SUNRPC: crash from svc_alloc_arg()

On Mon, Dec 18, 2023 at 10:46:22PM +0100, Guillaume Morin wrote:
> Hello Chuck,
> 
> I believe commit 5f7fc5d "SUNRPC: Resupply rq_pages from node-local memory" in
> Linux 6.5+ is incorrect. It unconditionally passes rq_pool->sp_id as the NUMA
> node.
> 
> While the comment in the svc_pool declaration in sunrpc/svc.h says that
> sp_id is also the NUMA node id, that might not be the case if the svc is
> created using svc_create_pooled(). svc_create_pooled() can use the
> per-cpu pool mode, in which case sp_id would be the cpu id.
> 
> From __svc_create():
> 	for (i = 0; i < serv->sv_nrpools; i++) {
> 		struct svc_pool *pool = &serv->sv_pools[i];
> 
> 		dprintk("svc: initialising pool %u for %s\n",
> 				i, serv->sv_name);
> 
> 		pool->sp_id = i;
> 
> When using the per-cpu pool mode, this triggers a BUG on my machine:
> BUG: unable to handle page fault for address: 0000000000002088
> 
>  #7 [ffffafa3dc42fc90] asm_exc_page_fault at ffffffffa3e00bc7
>     [exception RIP: __next_zones_zonelist+9]
>     RIP: ffffffffa32fbbc9  RSP: ffffafa3dc42fd48  RFLAGS: 00010286
>     RAX: 0000000000002080  RBX: 0000000000000000  RCX: ffff8ba5f22bafc0
>     RDX: ffff8ba5f22bafc0  RSI: 0000000000000002  RDI: 0000000000002080
>     RBP: ffffafa3dc42fdc0   R8: 0000000000002080   R9: ffff8ba62138c2d8
>     R10: 0000000000000001  R11: 0000000000000000  R12: 0000000000000cc0
>     R13: 0000000000000002  R14: 0000000000000000  R15: 0000000000000001
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #8 [ffffafa3dc42fd50] __alloc_pages at ffffffffa334c122
>  #9 [ffffafa3dc42fdc8] __alloc_pages_bulk at ffffffffa334c519
> #10 [ffffafa3dc42fe58] svc_alloc_arg at ffffffffc0afc0d7 [sunrpc]
> #11 [ffffafa3dc42fea0] svc_recv at ffffffffc0afe08d [sunrpc]
> #12 [ffffafa3dc42fec8] nfsd at ffffffffc0dec469 [nfsd]
> #13 [ffffafa3dc42fee8] kthread at ffffffffa30e4826
> 
> I believe the fix is to expose svc_pool_map_get_node() and use that in
> the alloc_pages_bulk_array_node() call in svc_xprt.c. Reverting 5f7fc5d
> would obviously work as well.
> 
> The comment in svc.h should probably be updated as well since it's misleading.
> 
> I didn't provide a patch because I wasn't quite sure which approach you would
> prefer but could provide one if that's helpful.

Reverted and applied for v6.7-rc (see my nfsd-fixes branch). Thanks
for the report and analysis!


-- 
Chuck Lever



