On Mon, 2024-10-28 at 13:27 -0400, Jeff Layton wrote:
> On Mon, 2024-10-28 at 08:46 -0400, Tom Talpey wrote:
> > On 10/28/2024 5:18 AM, Benoît Gschwind wrote:
> > > Hello,
> > > 
> > > The issue triggered again, I attached the result of:
> > > 
> > > # dmesg -W | tee dmesg.txt
> > > 
> > > using:
> > > 
> > > # echo t > /proc/sysrq-trigger
> > > 
> > > I have the following PIDs stuck:
> > > 
> > > 1474 D (disk sleep) 0:54:58.602 [nfsd]
> > > 1475 D (disk sleep) 0:54:58.602 [nfsd]
> > > 1484 D (disk sleep) 0:54:58.602 [nfsd]
> > > 1495 D (disk sleep) 0:54:58.602 [nfsd]
> > 
> > Hmm, 1495 is stuck in nfsd4_create_session
> > 
> > > [427468.304955] task:nfsd state:D stack:0 pid:1495 ppid:2 flags:0x00004000
> > > [427468.304962] Call Trace:
> > > [427468.304965] <TASK>
> > > [427468.304971] __schedule+0x34d/0x9e0
> > > [427468.304983] schedule+0x5a/0xd0
> > > [427468.304991] schedule_timeout+0x118/0x150
> > > [427468.305003] wait_for_completion+0x86/0x160
> > > [427468.305015] __flush_workqueue+0x152/0x420
> > > [427468.305031] nfsd4_create_session+0x79f/0xba0 [nfsd]
> > > [427468.305092] nfsd4_proc_compound+0x34c/0x660 [nfsd]
> > > [427468.305147] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
> > > [427468.305199] svc_process_common+0x295/0x610 [sunrpc]
> > > [427468.305269] ? svc_recv+0x491/0x810 [sunrpc]
> > > [427468.305337] ? nfsd_svc+0x370/0x370 [nfsd]
> > > [427468.305389] ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
> > > [427468.305437] svc_process+0xad/0x100 [sunrpc]
> > > [427468.305505] nfsd+0x99/0x140 [nfsd]
> > > [427468.305555] kthread+0xda/0x100
> > > [427468.305562] ? kthread_complete_and_exit+0x20/0x20
> > > [427468.305572] ret_from_fork+0x22/0x30
> > 
> > and the other three are stuck in nfsd4_destroy_session
> 
> All 4 processes are stuck waiting on a flush_workqueue call. It's
> probably one of these:
> 
> flush_workqueue(clp->cl_callback_wq);
> 
> So the question here is really -- why are the callback workqueue jobs
> stuck? I do see this:
> 
> [427468.316839] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
> [427468.316899] Call Trace:
> [427468.316902] <TASK>
> [427468.316908] __schedule+0x34d/0x9e0
> [427468.316919] schedule+0x5a/0xd0
> [427468.316927] schedule_timeout+0x94/0x150
> [427468.316937] ? __bpf_trace_tick_stop+0x10/0x10
> [427468.316947] rpc_shutdown_client+0xf2/0x150 [sunrpc]
> [427468.317015] ? cpuusage_read+0x10/0x10
> [427468.317026] nfsd4_process_cb_update+0x4c/0x270 [nfsd]
> [427468.317097] nfsd4_run_cb_work+0x9f/0x150 [nfsd]
> [427468.317146] process_one_work+0x1c7/0x380
> [427468.317158] worker_thread+0x4d/0x380
> [427468.317170] ? rescuer_thread+0x3a0/0x3a0
> [427468.317177] kthread+0xda/0x100
> [427468.317185] ? kthread_complete_and_exit+0x20/0x20
> [427468.317195] ret_from_fork+0x22/0x30
> [427468.317213] </TASK>
> 
> Maybe the RPC client is having trouble clearing clnt->cl_tasks ?
> 

Interestingly, I hit this today while testing the patches that widen
the backchannel:

[ 484.550350] INFO: task nfsd:1168 blocked for more than 120 seconds.
[ 484.551746] Not tainted 6.12.0-rc5+ #188
[ 484.552854] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 484.586316] task:nfsd state:D stack:0 pid:1168 tgid:1168 ppid:2 flags:0x00004000
[ 484.588417] Call Trace:
[ 484.589070] <TASK>
[ 484.589624] __schedule+0x4e4/0xb90
[ 484.622306] schedule+0x26/0xf0
[ 484.623162] schedule_timeout+0x121/0x150
[ 484.624344] ? __smp_call_single_queue+0xa7/0x110
[ 484.625375] ? x2apic_send_IPI+0x45/0x50
[ 484.654299] ? ttwu_queue_wakelist+0xbf/0x110
[ 484.655454] __wait_for_common+0x91/0x1c0
[ 484.656372] ? __pfx_schedule_timeout+0x10/0x10
[ 484.657429] __flush_workqueue+0x13a/0x400
[ 484.686304] nfsd4_create_session+0x9cd/0xd90 [nfsd]
[ 484.687562] nfsd4_proc_compound+0x396/0x750 [nfsd]
[ 484.688746] nfsd_dispatch+0xc6/0x210 [nfsd]
[ 484.689835] svc_process_common+0x4e6/0x6d0 [sunrpc]
[ 484.722305] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[ 484.723532] svc_process+0x12d/0x1c0 [sunrpc]
[ 484.724619] svc_recv+0x7cf/0x980 [sunrpc]
[ 484.725618] ? __pfx_nfsd+0x10/0x10 [nfsd]
[ 484.758308] nfsd+0x9f/0x100 [nfsd]
[ 484.759184] kthread+0xdd/0x110
[ 484.759905] ? __pfx_kthread+0x10/0x10
[ 484.760870] ret_from_fork+0x30/0x50
[ 484.761603] ? __pfx_kthread+0x10/0x10
[ 484.790302] ret_from_fork_asm+0x1a/0x30
[ 484.791151] </TASK>

The host was hung after this, so I can't fully confirm the scenario
below, but I think it's probably correct:

At some point the client wants to create a new session. When that
happens, the create_session call tries to flush_workqueue() on the
client-wide backchannel workqueue. That workqueue is stuck, as it's
waiting for all of the callback RPCs to die off (as evidenced by the
stack from Benoît's info that shows it hung in rpc_shutdown_client).

So I guess the question at this point is why these RPC tasks aren't
dying like they're supposed to when rpc_killall_tasks() is run.

I'm guessing that the backchannel-widening patches may make this a bit
easier to reproduce? In any case, this problem doesn't seem to be
confined to v6.1.
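To make the dependency explicit, here's a rough sketch of the two
pieces involved. The rpc_shutdown_client() loop is paraphrased from
net/sunrpc/clnt.c rather than quoted verbatim (details vary by kernel
version), and the flush_workqueue() call is the one from the snippet
quoted above:

/* Paraphrase of rpc_shutdown_client() (net/sunrpc/clnt.c) -- not the
 * literal source. Per the trace above, this runs in the callback
 * workqueue work item via nfsd4_run_cb_work() ->
 * nfsd4_process_cb_update(). It only returns once clnt->cl_tasks is
 * empty, re-issuing rpc_killall_tasks() about once per second until
 * the list drains. */
void rpc_shutdown_client(struct rpc_clnt *clnt)
{
	while (!list_empty(&clnt->cl_tasks)) {
		rpc_killall_tasks(clnt);
		wait_event_timeout(destroy_wait,
				   list_empty(&clnt->cl_tasks), 1 * HZ);
	}
	rpc_release_client(clnt);
}

/* Meanwhile, nfsd4_create_session() (and nfsd4_destroy_session()) wait
 * for that same work item to complete before proceeding: */
flush_workqueue(clp->cl_callback_wq);

So if rpc_killall_tasks() never manages to get the last task off
cl_tasks, the work item never finishes, flush_workqueue() never
returns, and every nfsd thread that subsequently hits create_session
or destroy_session piles up in D state behind it -- which matches both
sets of traces above.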
I'm still looking at this, so stay tuned...
-- 
Jeff Layton <jlayton@xxxxxxxxxx>