Hi Bharath, On 10/21/24 2:26 AM, Bharath SM wrote: > We are working on a prototype of NFS v4.1 server migration and during > our testing we are noticing the following crashes with NFS clients > during the movement of traffic from Server-1 to Server-2. > Tested on Ubuntu distro with 5.15, 6.5 and 6.8 Linux kernel versions > and observed following crashes while accessing invalid xprt > structures. Can you please take a look at the issue and my findings so > far and suggest next steps? > Also please let me know if you would need additional information. > > 1. Crash call stack: > > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253541] RIP: > 0010:xprt_reserve+0x3c/0xd0 [sunrpc] > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253601] Code: bf b8 00 00 > 00 00 c7 47 04 00 00 00 00 4c 8b af a8 00 00 00 74 0d 5b 41 5c 41 5d > 41 5e 5d c3 cc cc cc cc c7 47 04 f5 ff ff ff <49> 8b 85 08 04 00 00 49 > 89 fc f6 c4 02 75 28 49 8b > 45 08 4c 89 e6 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253603] RSP: > 0018:ff5d77a18d147b58 EFLAGS: 00010246 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253606] RAX: > 0000000000000000 RBX: ff1ef854fc4b0000 RCX: 0000000000000000 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253607] RDX: > 0000000000000000 RSI: 0000000000000000 RDI: ff1ef854d2191d00 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253608] RBP: > ff5d77a18d147b78 R08: ff1ef854d2191d30 R09: ffffffff8fa071a0 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253609] R10: > ff1ef854d2191d00 R11: 0000000000000076 R12: ff1ef854d2191d00 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253611] R13: > 0000000000000000 R14: 0000000000000000 R15: ffffffffc0d2e5e0 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253613] FS: > 0000000000000000(0000) GS:ff1ef8586fc40000(0000) > knlGS:0000000000000000 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253615] CS: 0010 DS: 0000 > ES: 0000 CR0: 0000000080050033 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253617] CR2: > 0000000000000408 CR3: 00000003ce43a005 CR4: 0000000000371ee0 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253618] DR0: > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253619] DR3: > 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253620] Call Trace: > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253621] <TASK> > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253623] ? show_regs+0x6a/0x80 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253628] ? __die+0x25/0x70 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253630] ? > page_fault_oops+0x79/0x180 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253634] ? > do_user_addr_fault+0x320/0x660 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253636] ? vsnprintf+0x37d/0x550 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253640] ? exc_page_fault+0x74/0x160 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253644] ? > asm_exc_page_fault+0x27/0x30 > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253648] ? > __pfx_call_reserve+0x10/0x10 [sunrpc] > Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253688] ? 
> xprt_reserve+0x3c/0xd0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253730]
> call_reserve+0x1d/0x30 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253773]
> __rpc_execute+0xc1/0x2e0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253830]
> rpc_execute+0xbb/0xf0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253885]
> rpc_run_task+0x12e/0x190 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253926]
> rpc_call_sync+0x51/0xb0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253968]
> _nfs4_proc_create_session+0x17a/0x370 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254020] ? vprintk_default+0x1d/0x30
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254024] ? vprintk+0x3c/0x70
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254026] ? _printk+0x58/0x80
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254029]
> nfs4_proc_create_session+0x67/0x130 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254065] ?
> nfs4_proc_create_session+0x67/0x130 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254100] ?
> __pfx_nfs4_test_session_trunk+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254133]
> nfs4_reset_session+0xb2/0x1a0 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254170]
> nfs4_state_manager+0x3cc/0x950 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254206] ?
> kernel_sigaction+0x79/0x110
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254209] ?
> __pfx_nfs4_run_state_manager+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254244]
> nfs4_run_state_manager+0x64/0x170 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254280] ?
> __pfx_nfs4_run_state_manager+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254314] kthread+0xd6/0x100
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254317] ? __pfx_kthread+0x10/0x10
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254320] ret_from_fork+0x3c/0x60
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254325] ? __pfx_kthread+0x10/0x10
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254328] ret_from_fork_asm
>
> I looked at this crash and noticed the following:
>
> During NFS migration, when traffic starts going to Server-2, we see
> BAD SESSION errors. The reason for the bad session is that the
> client sends a stale session ID to Server-2 for some of the requests,
> even though that session was destroyed on Server-1, along with its
> client ID, just a few hundred milliseconds earlier. (Not sure whether
> some caching is playing a role here, or whether session draining is
> not happening cleanly.)
> The crash happens on the client because, in response to the BAD
> SESSION error from the server, the client attempts session recovery
> and fails to get a transport for the RPC request, resulting in a
> crash. (Accessing already-freed structures?)
>
> Snippet from trace-cmd logs:
>
> The client was released here:
> lxfilesstress-3416 [005] 5154.980981: rpc_clnt_shutdown: client=00000000
> lxfilesstress-3416 [005] 5154.980982: rpc_clnt_release: client=00000000
>
> Just a few ms later, the same client is used here for the session
> recovery operation, after it was already freed:
> 20.150.9ß/I\^L;l-22349 [006] 5155.042037: rpc_request:
> task:00000011@00000000 nfsv4 CREATE_SESSION (sync)
> 20.150.9ß/I\^L;l-22349 [006] 5155.042040: rpc_task_run_action:
> task:00000011@00000000
> flags=DYNAMIC|NO_ROUND_ROBIN|SOFT|TIMEOUT|NORTO|CRED_NOREF
> runstate=RUNNING|ACTIVE status=0 action=call_reserve ------> Crash
> after this in call_reserve.
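Those last two trace lines really do look like a use-after-free: the
rpc_clnt is shut down and released at 5154.98, and about 60ms later a
CREATE_SESSION task is run against the same client. The garbled comm
field on the 5155.04 lines may itself be fallout from freed memory
being reused. If the state manager can still issue RPCs at that point,
then nothing is pinning the rpc_clnt across recovery. Purely to
illustrate what I mean (this is a sketch, not a proposed patch, and
hypothetical_recovery_op() is a made-up name), the recovery path would
need to hold its own reference for the duration:

/*
 * Illustrative sketch only: take a reference on the rpc_clnt before
 * issuing the recovery RPC, so a concurrent rpc_shutdown_client()
 * can't free it underneath us.  This only helps if the reference is
 * taken while the client is still known to be live; once cl_count
 * has already dropped to zero it is too late.
 */
static int hypothetical_recovery_op(struct nfs_client *clp,
                                    const struct cred *cred)
{
        struct rpc_clnt *clnt = clp->cl_rpcclient;
        int status;

        if (!refcount_inc_not_zero(&clnt->cl_count))
                return -EIO;            /* client is already going away */

        status = nfs4_proc_create_session(clp, cred);

        rpc_release_client(clnt);       /* drop our reference */
        return status;
}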
So, the first thing that's sticking out to me in looking through
nfs4state.c is that we're setting, testing, and clearing the various
state recovery bits outside of a spinlock (and taking the lock right
after). That could open up races, which may be what you're seeing;
I've sketched the pattern I mean at the bottom of this mail. How often
does the crash happen for you: every time, or only sometimes?

>
> Client code:
>
> void xprt_reserve(struct rpc_task *task)
> {
>         struct rpc_xprt *xprt = task->tk_xprt;
>         ------------------------<<<<<<<< tk_xprt pointer is NULL
>
>         task->tk_status = 0;
>         if (task->tk_rqstp != NULL)
>                 return;
>
>         task->tk_status = -EAGAIN;
>         if (!xprt_throttle_congested(xprt, task)) {
>                 xprt_do_reserve(xprt, task);
>         }
> }
>
> tk_xprt is NULL in the above function because, for the session
> recovery RPC request, the sunrpc module tries to get an xprt for the
> client using 'rpc_task_get_first_xprt(clnt)', but this function
> returns NULL.
>
> static void rpc_task_set_transport(struct rpc_task *task, struct rpc_clnt *clnt)
> {
>         if (task->tk_xprt) {
>                 if (!(test_bit(XPRT_OFFLINE, &task->tk_xprt->state) &&
>                       (task->tk_flags & RPC_TASK_MOVEABLE)))
>                         return;
>                 xprt_release(task);
>                 xprt_put(task->tk_xprt);
>         }
>
>         if (task->tk_flags & RPC_TASK_NO_ROUND_ROBIN)
>                 task->tk_xprt = rpc_task_get_first_xprt(clnt);
>                 ------------------------------- Returns NULL
>         else
>                 task->tk_xprt = rpc_task_get_next_xprt(clnt);
> }
>
> We also noticed that in nfs4_update_server(), after the
> nfs4_set_client() call, we aren't calling nfs_init_server_rpcclient(),
> whereas in all other cases after nfs4_set_client() we always call
> nfs_init_server_rpcclient(). Can you please let me know if this is
> expected?
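To make the nfs4state.c concern concrete, here's roughly the shape of
the pattern I'm worried about (an illustrative sketch, not actual
nfs4state.c code; HYPOTHETICAL_SESSION_RESET is a made-up bit). Each
bitop is atomic on its own, but the test-then-act sequence as a whole
is not, and the spinlock is taken only after the decision has already
been made:

/* Thread A (state manager): */
if (test_bit(HYPOTHETICAL_SESSION_RESET, &clp->cl_state)) {
        /*
         * Window: thread B can clear the bit and tear the session
         * down right here, before we act on the stale decision.
         */
        spin_lock(&clp->cl_lock);       /* lock taken too late */
        /* ... act on session state that may already be gone ... */
        spin_unlock(&clp->cl_lock);
}

/* Thread B (teardown path), running concurrently: */
clear_bit(HYPOTHETICAL_SESSION_RESET, &clp->cl_state);
/* ... destroy the session / shut down the rpc_clnt ... */

If the teardown can run between the test_bit() and the spin_lock(),
the state manager ends up doing recovery against structures that are
already gone, which would match your CREATE_SESSION crash.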
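On the xprt_reserve() side: whatever the underlying race turns out to
be, a NULL tk_xprt there means the task was set up against a client
whose transport list was already empty or being torn down, so the oops
is just the messenger. As a debugging aid (again a sketch, not a fix,
since papering over the NULL would only hide the lifetime bug), you
could fail the task instead of dereferencing NULL, and the WARN would
give you a stack trace every time it fires:

void xprt_reserve(struct rpc_task *task)
{
        struct rpc_xprt *xprt = task->tk_xprt;

        task->tk_status = 0;
        if (task->tk_rqstp != NULL)
                return;

        /* Debugging aid: no transport assigned, fail instead of oops. */
        if (WARN_ON_ONCE(xprt == NULL)) {
                task->tk_status = -EIO;
                return;
        }

        task->tk_status = -EAGAIN;
        if (!xprt_throttle_congested(xprt, task))
                xprt_do_reserve(xprt, task);
}

As for nfs4_update_server() not calling nfs_init_server_rpcclient()
after nfs4_set_client(): I'd have to look at that path more closely
before saying whether it's intentional, but it's worth comparing with
the other nfs4_set_client() callers.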