Jeff Layton writes via Kernel.org Bugzilla:

There is another scenario that could explain a hang here. From
nfsd4_cb_sequence_done():

------------------8<---------------------
	case -NFS4ERR_BADSLOT:
		goto retry_nowait;
	case -NFS4ERR_SEQ_MISORDERED:
		if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
			session->se_cb_seq_nr[cb->cb_held_slot] = 1;
			goto retry_nowait;
		}
		break;
	default:
		nfsd4_mark_cb_fault(cb->cb_clp);
	}
	trace_nfsd_cb_free_slot(task, cb);
	nfsd41_cb_release_slot(cb);

	if (RPC_SIGNALLED(task))
		goto need_restart;
out:
	return ret;
retry_nowait:
	if (rpc_restart_call_prepare(task))
		ret = false;
	goto out;
------------------8<---------------------

Since it doesn't check RPC_SIGNALLED in the v4.1+ case until very late
in the function, it's possible to get a BADSLOT or SEQ_MISORDERED error
that causes the callback client to immediately resubmit the rpc_task to
the RPC engine without resubmitting to the callback workqueue.

I think that we should assume that when RPC_SIGNALLED returns true that
the result is suspect, and that we should halt further processing into
the CB_SEQUENCE response and restart the callback.

View: https://bugzilla.kernel.org/show_bug.cgi?id=219710#c18
You can reply to this message to join the discussion.
-- 
Deet-doot-dot, I am a bot.
Kernel.org Bugzilla (bugspray 0.1-dev)
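
For illustration only, the ordering suggested in the comment above could be modelled in user space roughly as follows. This is a sketch under stated assumptions, not kernel code and not a proposed patch: struct cb_model, the cb_status and cb_action enums, and cb_sequence_decide() are hypothetical names invented here. The signalled field stands in for RPC_SIGNALLED(task) and seq_nr for se_cb_seq_nr[cb->cb_held_slot]. Only the three cases shown in the excerpt are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the statuses shown in the excerpt. */
enum cb_status { CB_BADSLOT, CB_SEQ_MISORDERED, CB_OTHER_ERROR };

/* Hypothetical outcomes, named after the paths in the excerpt. */
enum cb_action {
	CB_RELEASE_SLOT,       /* fall through: release the slot, return */
	CB_RETRY_NOWAIT,       /* resubmit directly to the RPC engine */
	CB_NEED_RESTART,       /* requeue via the callback workqueue */
	CB_FAULT_AND_RELEASE   /* mark the channel faulty, release the slot */
};

struct cb_model {
	bool signalled;        /* models RPC_SIGNALLED(task) */
	unsigned int seq_nr;   /* models se_cb_seq_nr[cb->cb_held_slot] */
};

/*
 * Decide what to do with a CB_SEQUENCE reply. The signalled check comes
 * first, so a suspect result never reaches the retry_nowait path.
 */
enum cb_action cb_sequence_decide(struct cb_model *cb, enum cb_status status)
{
	assert(cb != NULL);

	/* Assumption from the comment: a signalled task's result is suspect. */
	if (cb->signalled)
		return CB_NEED_RESTART;

	switch (status) {
	case CB_BADSLOT:
		return CB_RETRY_NOWAIT;
	case CB_SEQ_MISORDERED:
		if (cb->seq_nr != 1) {
			cb->seq_nr = 1;	/* reset the sequence number, then retry */
			return CB_RETRY_NOWAIT;
		}
		return CB_RELEASE_SLOT;
	default:
		return CB_FAULT_AND_RELEASE;
	}
}
```

In this model a signalled task always yields CB_NEED_RESTART and leaves seq_nr untouched, whatever status the reply carried. In the excerpt's ordering the same input could take the retry_nowait path before the signalled check is ever reached.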