On 1/29/25 2:38 PM, Jeff Layton wrote:
On Wed, 2025-01-29 at 16:39 -0500, Chuck Lever wrote:
On 1/29/25 4:17 PM, Dai Ngo wrote:
If nfs4_client is in COURTESY state then there is no point to retry
the callback. This causes nfsd4_shutdown_callback to hang since
cl_cb_inflight is not 0. This hang lasts about 15 minutes until TCP
notifies NFSD that the connection was closed.
This patch modifies nfsd4_cb_sequence_done to skip the restart the
RPC if nfs4_client is in COURTESY state.
Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
---
fs/nfsd/nfs4callback.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 50e468bdb8d4..c90f94898cc5 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1372,6 +1372,11 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
ret = false;
break;
case 1:
+ if (clp->cl_state == NFSD4_COURTESY) {
+ nfsd4_mark_cb_fault(cb->cb_clp);
+ ret = false;
+ break;
+ }
We could do toss this operation here or at the top of
nfsd4_run_cb_work().
Like Olga, I'm wondering if this is the culprit on the recently
reported rpc_shutdown_client hangs.
I'm assuming that with a courtesy client that we don't want to do any
callbacks? If that's the case, then I think doing it early in
nfsd4_run_cb_work() would be better. That would prevent new cb's from
being queued until the cl_state changes too.
I'll move this check into nfsd4_run_cb_work.
Thanks Chuck and Jeff.
-Dai
/*
* cb_seq_status remains 1 if an RPC Reply was never
* received. NFSD can't know if the client processed
Nice catch though!