Jeff Layton writes via Kernel.org Bugzilla:

I looked at this with Chuck the other day. As far as wait_var_event() getting stuck, I think all that would take is for nfsd4_cb_sequence_done() to continually set cb_need_restart on every call. That would cause the callback to not be destroyed and to never call nfsd41_cb_inflight_end().

That happens in the need_restart: label in nfsd4_cb_sequence_done(). These two cases goto that:

	if (!clp->cl_minorversion) {
		/*
		 * If the backchannel connection was shut down while this
		 * task was queued, we need to resubmit it after setting up
		 * a new backchannel connection.
		 *
		 * Note that if we lost our callback connection permanently
		 * the submission code will error out, so we don't need to
		 * handle that case here.
		 */
		if (RPC_SIGNALLED(task))
			goto need_restart;

		return true;
	}

	if (cb->cb_held_slot < 0)
		goto need_restart;

It doesn't seem likely that it somehow lost the slot, so my guess is that the RPC task is continually returning with RPC_SIGNALLED() set.

Question for Baptiste -- what NFS versions are your clients using?

View: https://bugzilla.kernel.org/show_bug.cgi?id=219710#c6
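
The suspected failure mode can be illustrated with a small user-space model. This is only a sketch and not kernel code: every `sim_*` name is hypothetical, the `inflight` counter merely stands in for whatever wait_var_event() is waiting on, and only the two quoted `goto need_restart` branches are modelled. It shows that if every attempt takes the restart path, the in-flight count never drops back to zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-client callback state. */
struct sim_client {
	int minorversion;	/* 0 models an NFSv4.0 client */
	int inflight;		/* models the in-flight callback count */
};

/* Hypothetical stand-in for a single callback. */
struct sim_callback {
	struct sim_client *clp;
	int held_slot;		/* < 0 means no session slot is held */
	bool need_restart;
};

/*
 * Models only the two quoted branches. Returns true when the callback
 * is finished, false when it has been flagged for restart.
 */
static bool sim_sequence_done(struct sim_callback *cb, bool signalled)
{
	if (!cb->clp->minorversion) {
		if (signalled)
			goto need_restart;
		return true;
	}

	if (cb->held_slot < 0)
		goto need_restart;
	return true;

need_restart:
	cb->need_restart = true;
	return false;
}

/*
 * Runs one callback for at most max_attempts attempts. The first
 * signalled_attempts attempts behave as if the task came back
 * signalled. Returns the in-flight count afterwards; a non-zero value
 * means a waiter for "inflight == 0" would still be blocked.
 */
static int sim_run_callback(struct sim_client *clp, int held_slot,
			    int signalled_attempts, int max_attempts)
{
	struct sim_callback cb = {
		.clp = clp,
		.held_slot = held_slot,
		.need_restart = false,
	};

	clp->inflight++;	/* callback is now in flight */
	for (int attempt = 0; attempt < max_attempts; attempt++) {
		bool signalled = attempt < signalled_attempts;

		cb.need_restart = false;
		if (sim_sequence_done(&cb, signalled)) {
			clp->inflight--;	/* models the inflight-end step */
			break;
		}
		/* Restart path: the callback is requeued, not released. */
	}

	assert(clp->inflight >= 0);
	return clp->inflight;
}
```

In this model, calling `sim_run_callback()` for a minorversion 0 client with `signalled_attempts` equal to `max_attempts` leaves `inflight` at 1, which corresponds to the guess above that the task keeps returning signalled. A client with a non-zero minorversion and `held_slot < 0` gets stuck the same way.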