SUNRPC: crash from svc_alloc_arg()

Hello Chuck,

I believe commit 5f7fc5d "SUNRPC: Resupply rq_pages from node-local memory" in
Linux 6.5+ is incorrect. It unconditionally passes rq_pool->sp_id as the NUMA
node.

While the comment in the svc_pool declaration in sunrpc/svc.h says that
sp_id is also the NUMA node id, that might not be the case if the svc is
created using svc_create_pooled(). svc_create_pooled() can use the
per-cpu pool mode, and in that case sp_id is the CPU id, not the node id.

From __svc_create():
	for (i = 0; i < serv->sv_nrpools; i++) {
		struct svc_pool *pool = &serv->sv_pools[i];

		dprintk("svc: initialising pool %u for %s\n",
				i, serv->sv_name);

		pool->sp_id = i;

When using the per-cpu pool mode, this triggers a BUG on my machine:
BUG: unable to handle page fault for address: 0000000000002088

 #7 [ffffafa3dc42fc90] asm_exc_page_fault at ffffffffa3e00bc7
    [exception RIP: __next_zones_zonelist+9]
    RIP: ffffffffa32fbbc9  RSP: ffffafa3dc42fd48  RFLAGS: 00010286
    RAX: 0000000000002080  RBX: 0000000000000000  RCX: ffff8ba5f22bafc0
    RDX: ffff8ba5f22bafc0  RSI: 0000000000000002  RDI: 0000000000002080
    RBP: ffffafa3dc42fdc0   R8: 0000000000002080   R9: ffff8ba62138c2d8
    R10: 0000000000000001  R11: 0000000000000000  R12: 0000000000000cc0
    R13: 0000000000000002  R14: 0000000000000000  R15: 0000000000000001
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffafa3dc42fd50] __alloc_pages at ffffffffa334c122
 #9 [ffffafa3dc42fdc8] __alloc_pages_bulk at ffffffffa334c519
#10 [ffffafa3dc42fe58] svc_alloc_arg at ffffffffc0afc0d7 [sunrpc]
#11 [ffffafa3dc42fea0] svc_recv at ffffffffc0afe08d [sunrpc]
#12 [ffffafa3dc42fec8] nfsd at ffffffffc0dec469 [nfsd]
#13 [ffffafa3dc42fee8] kthread at ffffffffa30e4826

I believe the fix is to expose svc_pool_map_get_node() and use that in
the alloc_pages_bulk_array_node() call in svc_xprt.c. Reverting 5f7fc5d
would obviously work as well.
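
To illustrate why the pool id cannot be used directly as a node id, here is
a minimal userspace sketch of a pool-to-node lookup. It is not kernel code:
every model_* name, the struct layout and the CPU-to-node table are
assumptions made up for this example, and the real helper may differ.

```c
#include <stddef.h>

/* Userspace model only; all names here are illustrative, not kernel API. */
#define MODEL_NO_NODE (-1)

enum model_pool_mode {
	MODEL_POOL_GLOBAL,	/* single pool, no CPU or node affinity */
	MODEL_POOL_PERCPU,	/* one pool per CPU: pool id maps to a CPU */
	MODEL_POOL_PERNODE,	/* one pool per node: pool id maps to a node */
};

struct model_pool_map {
	enum model_pool_mode mode;
	unsigned int count;		/* number of pools */
	const unsigned int *pool_to;	/* pool id -> CPU id or node id */
	const int *cpu_to_node;		/* CPU id -> node id (assumed topology) */
};

/*
 * Translate a pool id into a node id. In per-cpu mode the pool id names a
 * CPU, so it has to go through the CPU-to-node table first. Returns
 * MODEL_NO_NODE when the pool has no node affinity or the id is out of range.
 */
static int model_pool_get_node(const struct model_pool_map *m,
			       unsigned int pidx)
{
	if (m->count == 0 || pidx >= m->count)
		return MODEL_NO_NODE;

	switch (m->mode) {
	case MODEL_POOL_PERCPU:
		return m->cpu_to_node[m->pool_to[pidx]];
	case MODEL_POOL_PERNODE:
		return (int)m->pool_to[pidx];
	default:
		return MODEL_NO_NODE;
	}
}
```

With 8 CPUs spread over 2 nodes in per-cpu mode, pool 5 maps to node 1 in
this model, whereas using the pool id directly would ask the allocator for
node 5, which does not exist.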

The comment in svc.h should probably be updated as well since it's misleading.

I didn't provide a patch because I wasn't quite sure which approach you would
prefer but could provide one if that's helpful.

HTH

Guillaume.

-- 
Guillaume Morin <guillaume@xxxxxxxxxxx>



