> On Oct 21, 2024, at 2:26 AM, Bharath SM <bharathsm.hsk@xxxxxxxxx> wrote:
>
> We are working on a prototype of NFS v4.1 server migration, and during
> our testing we are noticing the following crashes with NFS clients
> during the movement of traffic from Server-1 to Server-2.
> We tested on the Ubuntu distro with the 5.15, 6.5, and 6.8 Linux kernel
> versions and observed the following crashes while accessing invalid
> xprt structures. Can you please take a look at the issue and my
> findings so far and suggest next steps?
> Also, please let me know if you need any additional information.

Removed Bill Baker's email: he's no longer at Oracle.

This week we are all at an NFS plug-fest. Responses will be slow on
these deeper, more detailed issues.

> 1. Crash call stack:
>
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253541] RIP: 0010:xprt_reserve+0x3c/0xd0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253601] Code: bf b8 00 00 00 00 c7 47 04 00 00 00 00 4c 8b af a8 00 00 00 74 0d 5b 41 5c 41 5d 41 5e 5d c3 cc cc cc cc c7 47 04 f5 ff ff ff <49> 8b 85 08 04 00 00 49 89 fc f6 c4 02 75 28 49 8b 45 08 4c 89 e6
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253603] RSP: 0018:ff5d77a18d147b58 EFLAGS: 00010246
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253606] RAX: 0000000000000000 RBX: ff1ef854fc4b0000 RCX: 0000000000000000
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253607] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff1ef854d2191d00
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253608] RBP: ff5d77a18d147b78 R08: ff1ef854d2191d30 R09: ffffffff8fa071a0
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253609] R10: ff1ef854d2191d00 R11: 0000000000000076 R12: ff1ef854d2191d00
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253611] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffc0d2e5e0
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253613] FS: 0000000000000000(0000) GS:ff1ef8586fc40000(0000) knlGS:0000000000000000
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253617] CR2: 0000000000000408 CR3: 00000003ce43a005 CR4: 0000000000371ee0
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253618] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253619] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253620] Call Trace:
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253621]  <TASK>
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253623]  ? show_regs+0x6a/0x80
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253628]  ? __die+0x25/0x70
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253630]  ? page_fault_oops+0x79/0x180
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253634]  ? do_user_addr_fault+0x320/0x660
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253636]  ? vsnprintf+0x37d/0x550
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253640]  ? exc_page_fault+0x74/0x160
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253644]  ? asm_exc_page_fault+0x27/0x30
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253648]  ? __pfx_call_reserve+0x10/0x10 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253688]  ? xprt_reserve+0x3c/0xd0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253730]  call_reserve+0x1d/0x30 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253773]  __rpc_execute+0xc1/0x2e0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253830]  rpc_execute+0xbb/0xf0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253885]  rpc_run_task+0x12e/0x190 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253926]  rpc_call_sync+0x51/0xb0 [sunrpc]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.253968]  _nfs4_proc_create_session+0x17a/0x370 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254020]  ? vprintk_default+0x1d/0x30
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254024]  ? vprintk+0x3c/0x70
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254026]  ? _printk+0x58/0x80
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254029]  nfs4_proc_create_session+0x67/0x130 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254065]  ? nfs4_proc_create_session+0x67/0x130 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254100]  ? __pfx_nfs4_test_session_trunk+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254133]  nfs4_reset_session+0xb2/0x1a0 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254170]  nfs4_state_manager+0x3cc/0x950 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254206]  ? kernel_sigaction+0x79/0x110
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254209]  ? __pfx_nfs4_run_state_manager+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254244]  nfs4_run_state_manager+0x64/0x170 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254280]  ? __pfx_nfs4_run_state_manager+0x10/0x10 [nfsv4]
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254314]  kthread+0xd6/0x100
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254317]  ? __pfx_kthread+0x10/0x10
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254320]  ret_from_fork+0x3c/0x60
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254325]  ? __pfx_kthread+0x10/0x10
> Jul 8 19:35:38 Ubuntu2204-1 kernel: [ 4453.254328]  ret_from_fork_asm
>
> I looked at this crash and noticed the following:
>
> During NFS migration, when traffic starts going to Server-2, we do see
> BAD SESSION errors. The reason for the bad session is that the client
> sends the old session ID to Server-2 for some of the requests, even
> though that session and its client ID were destroyed on Server-1 just
> a few hundred milliseconds earlier. (Not sure if there is some caching
> playing a role here or if session draining is not happening cleanly.)
> The crash happens on the client because, in response to the BAD SESSION
> error from the server, the client attempts session recovery and fails
> to get a transport for the RPC request, resulting in a crash
> (accessing already-freed structures?).
>
> Snippet from trace-cmd logs:
>
> The client was released here:
> lxfilesstress-3416 [005] 5154.980981: rpc_clnt_shutdown: client=00000000
> lxfilesstress-3416 [005] 5154.980982: rpc_clnt_release: client=00000000
>
> Just a few ms later, the same client ID is used for the session
> recovery operation, after it has already been freed:
> 20.150.9ß/I\^L;l-22349 [006] 5155.042037: rpc_request: task:00000011@00000000 nfsv4 CREATE_SESSION (sync)
> 20.150.9ß/I\^L;l-22349 [006] 5155.042040: rpc_task_run_action: task:00000011@00000000 flags=DYNAMIC|NO_ROUND_ROBIN|SOFT|TIMEOUT|NORTO|CRED_NOREF runstate=RUNNING|ACTIVE status=0 action=call_reserve ------> Crash after this in call_reserve.
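
Not a proposed fix, just an illustrative sketch based on the
xprt_reserve() code quoted below: while the root cause is investigated,
a local debugging guard along these lines would fail the CREATE_SESSION
task with an error instead of dereferencing the NULL tk_xprt, which can
help confirm that the transport lookup came back empty during session
recovery. The choice of -ENOTCONN here is arbitrary.

void xprt_reserve(struct rpc_task *task)
{
	struct rpc_xprt *xprt = task->tk_xprt;

	task->tk_status = 0;
	if (task->tk_rqstp != NULL)
		return;

	/* Debugging guard (illustration only): fail the RPC rather
	 * than oops on a NULL transport pointer. */
	if (WARN_ON_ONCE(!xprt)) {
		task->tk_status = -ENOTCONN;
		return;
	}

	task->tk_status = -EAGAIN;
	if (!xprt_throttle_congested(xprt, task))
		xprt_do_reserve(xprt, task);
}

The open question, as noted below, is why rpc_task_get_first_xprt()
hands back NULL for this client in the first place.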
>
> Client code:
>
> void xprt_reserve(struct rpc_task *task)
> {
> 	struct rpc_xprt *xprt = task->tk_xprt;
> 	--------------------<<<<<<<<<<<<<<<< tk_xprt pointer is NULL
> 	task->tk_status = 0;
> 	if (task->tk_rqstp != NULL)
> 		return;
> 	task->tk_status = -EAGAIN;
> 	if (!xprt_throttle_congested(xprt, task)) {
> 		xprt_do_reserve(xprt, task);
> 	}
> }
>
> tk_xprt is NULL in the above function because, for the session recovery
> RPC request, the sunrpc module tries to get an xprt for the given
> client using 'rpc_task_get_first_xprt(clnt)', but this function
> returns NULL.
>
> static void rpc_task_set_transport(struct rpc_task *task, struct rpc_clnt *clnt)
> {
> 	if (task->tk_xprt) {
> 		if (!(test_bit(XPRT_OFFLINE, &task->tk_xprt->state) &&
> 		      (task->tk_flags & RPC_TASK_MOVEABLE)))
> 			return;
> 		xprt_release(task);
> 		xprt_put(task->tk_xprt);
> 	}
>
> 	if (task->tk_flags & RPC_TASK_NO_ROUND_ROBIN)
> 		task->tk_xprt = rpc_task_get_first_xprt(clnt);
> 		------------------------------- Returns NULL
> 	else
> 		task->tk_xprt = rpc_task_get_next_xprt(clnt);
> }
>
> We also noticed that in nfs4_update_server, after the nfs4_set_client
> call, we aren't calling nfs_init_server_rpcclient(), whereas in all
> other cases after nfs4_set_client we always call
> nfs_init_server_rpcclient(). Can you please let me know whether this
> is expected?
>
>
> 2. Another crash call stack, with xprt structures being NULL just after
> migration:
>
> PID: 306169  TASK: ff4a4584198c8000  CPU: 6  COMMAND: "kworker/u16:0"
>  #0 [ff5630b9cb397a10] machine_kexec at ffffffffb9e95ac2
>  #1 [ff5630b9cb397a70] __crash_kexec at ffffffffb9fe022f
>  #2 [ff5630b9cb397b38] panic at ffffffffb9ed62ce
>  #3 [ff5630b9cb397bb8] oops_end at ffffffffb9e405a7
>  #4 [ff5630b9cb397be0] page_fault_oops at ffffffffb9eaa482
>  #5 [ff5630b9cb397c68] do_user_addr_fault at ffffffffb9eab090
>  #6 [ff5630b9cb397cb0] exc_page_fault at ffffffffbadb1884
>  #7 [ff5630b9cb397ce0] asm_exc_page_fault at ffffffffbae00bc7
>     [exception RIP: xprt_release+317]
>     RIP: ffffffffc0522afd  RSP: ff5630b9cb397d98  RFLAGS: 00010286
>     RAX: 0000000000000045  RBX: ff4a4584198c8000  RCX: 0000000000000027
>     RDX: 0000000000000000  RSI: 0000000000000001  RDI: ff4a45840a573d00
>     RBP: ff5630b9cb397db8   R8: 000000007867a738   R9: 0000000000000020
>     R10: 0000000000ffff10  R11: 000000000000000f  R12: ff4a45840a573d00
>     R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #8 [ff5630b9cb397dc0] rpc_release_resources_task at ffffffffc0533593 [sunrpc]
>  #9 [ff5630b9cb397dd8] __rpc_execute at ffffffffc053d3ff [sunrpc]
> #10 [ff5630b9cb397e38] rpc_async_schedule at ffffffffc053d830 [sunrpc]
> #11 [ff5630b9cb397e58] process_one_work at ffffffffb9efd14e
> #12 [ff5630b9cb397ea0] worker_thread at ffffffffb9efd3a0
> #13 [ff5630b9cb397ee8] kthread at ffffffffb9f06dd6
> #14 [ff5630b9cb397f28] ret_from_fork at ffffffffb9e4c62c
> #15 [ff5630b9cb397f50] ret_from_fork_asm at ffffffffb9e038db
>
>
> Thanks,
> Bharath

--
Chuck Lever