On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> I noticed that once an NFSv4.1 callback operation gets a
> NFS4ERR_DELAY status on CB_SEQUENCE and then the connection is lost,
> the callback client loops, resending it indefinitely.
>
> The switch arm in nfsd4_cb_sequence_done() that handles
> NFS4ERR_DELAY uses rpc_restart_call() to rearm the RPC state machine
> for the retransmit, but that path does not call the rpc_prepare_call
> callback again. Thus cb_seq_status is set to -10008 by the first
> NFS4ERR_DELAY result, but is never set back to 1 for the retransmits.
>
> nfsd4_cb_sequence_done() thinks it's getting nothing but a
> long series of CB_SEQUENCE NFS4ERR_DELAY replies.
>
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> ---
>  fs/nfsd/nfs4callback.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 926c29879c6a..43b0a34a5d5b 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1178,6 +1178,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>  		ret = false;
>  		break;
>  	case -NFS4ERR_DELAY:
> +		cb->cb_seq_status = 1;
>  		if (!rpc_restart_call(task))
>  			goto out;
>
>
>

Nice catch!

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
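
For anyone following the thread, the failure mode described in the quoted commit message can be modelled in a few lines of user-space C. This is only a sketch, not kernel code: the sim_* names, the reset_on_delay flag, and the send cap are all hypothetical, and the decoder stand-in assumes that cb_seq_status is overwritten only when a reply actually arrives. The value 10008 is taken from the -10008 quoted in the commit message.

```c
#include <stdbool.h>

/* Numeric value taken from the "-10008" in the commit message. */
#define NFS4ERR_DELAY 10008

/* Hypothetical stand-in for the one field of the callback that matters here. */
struct sim_callback {
	int cb_seq_status;
};

enum sim_outcome { SIM_DONE_OK, SIM_FAULT, SIM_RESTART };

/* Stand-in for reply decoding: the status changes only if a reply arrives. */
static void sim_decode_reply(struct sim_callback *cb, bool connected, int status)
{
	if (connected)
		cb->cb_seq_status = status;
}

/* Simplified model of the switch in the "done" handler. */
static enum sim_outcome sim_sequence_done(struct sim_callback *cb,
					  bool reset_on_delay)
{
	switch (cb->cb_seq_status) {
	case 0:
		return SIM_DONE_OK;
	case -NFS4ERR_DELAY:
		/* The one-line fix: re-arm the "no reply seen" marker. */
		if (reset_on_delay)
			cb->cb_seq_status = 1;
		return SIM_RESTART;	/* models restarting the call */
	default:
		return SIM_FAULT;	/* includes 1: no CB_SEQUENCE reply seen */
	}
}

/*
 * Send the callback up to max_sends times. The first send gets a
 * NFS4ERR_DELAY reply; after that the connection is lost and no reply is
 * ever decoded. Returns the number of sends before the loop stops, or -1
 * if it is still restarting when the cap is reached.
 */
static int sim_run(bool reset_on_delay, int max_sends)
{
	/* The prepare step sets the status to 1 once; restarts skip it. */
	struct sim_callback cb = { .cb_seq_status = 1 };
	int sends;

	for (sends = 1; sends <= max_sends; sends++) {
		bool connected = (sends == 1);

		sim_decode_reply(&cb, connected, -NFS4ERR_DELAY);
		if (sim_sequence_done(&cb, reset_on_delay) != SIM_RESTART)
			return sends;
	}
	return -1;
}
```

In this model, sim_run(false, 100) never leaves the restart path because the stale -10008 is seen on every pass, while sim_run(true, 100) stops on the second send, when the handler finds the status still at 1 and treats it as a fault.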