On Fri, Nov 01, 2024 at 08:41:33AM -0400, Jeff Layton wrote: > On Thu, 2024-10-31 at 09:40 -0400, cel@xxxxxxxxxx wrote: > > From: Chuck Lever <chuck.lever@xxxxxxxxxx> > > > > RFC 7862 permits callback services to respond to CB_OFFLOAD with > > NFS4ERR_DELAY. Currently NFSD drops the CB_OFFLOAD in that case. > > > > To improve the reliability of COPY offload, NFSD should rather send > > another CB_OFFLOAD completion notification. > > > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > > --- > > fs/nfsd/nfs4proc.c | 8 ++++++++ > > fs/nfsd/xdr4.h | 1 + > > 2 files changed, 9 insertions(+) > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > > index 39e90391bce2..0918d05c54a1 100644 > > --- a/fs/nfsd/nfs4proc.c > > +++ b/fs/nfsd/nfs4proc.c > > @@ -1617,6 +1617,13 @@ static int nfsd4_cb_offload_done(struct nfsd4_callback *cb, > > container_of(cb, struct nfsd4_cb_offload, co_cb); > > > > trace_nfsd_cb_offload_done(&cbo->co_res.cb_stateid, task); > > + switch (task->tk_status) { > > + case -NFS4ERR_DELAY: > > + if (cbo->co_retries--) { > > + rpc_delay(task, 1 * HZ); > > + return 0; > > + } > > + } > > return 1; > > Not a comment on your patch specifically, but when we can't send a > callback, should we be trying to log something? A pr_notice() warning? > Conditional tracepoint? I'm not sure of the best way to communicate > this, but it seems like something that admins might want to know. There is a tracepoint here and in every other callback completion handler. I can't think of anything actionable to report -- what would an admin need or want to do in response to a callback failure? There isn't much to do if, for instance, the client doesn't recognize the copy stateid. Also, DELAY is not infrequent and is a common temporary condition. My sense is that the only time you want to see callback failures is when you're looking for something specific. Otherwise, it will generate a lot of low-value noise. > Maybe nfsd needs its own trace buffer that could be scraped with > nfsdctl? The Flight Data Recorder can do that if needed. > > } > > > > @@ -1745,6 +1752,7 @@ static void nfsd4_send_cb_offload(struct nfsd4_copy *copy) > > memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res)); > > memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh)); > > cbo->co_nfserr = copy->nfserr; > > + cbo->co_retries = 5; > > > > nfsd4_init_cb(&cbo->co_cb, copy->cp_clp, &nfsd4_cb_offload_ops, > > NFSPROC4_CLNT_CB_OFFLOAD); > > diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h > > index dec29afa43f3..cd2bf63651e3 100644 > > --- a/fs/nfsd/xdr4.h > > +++ b/fs/nfsd/xdr4.h > > @@ -675,6 +675,7 @@ struct nfsd4_cb_offload { > > struct nfsd4_callback co_cb; > > struct nfsd42_write_res co_res; > > __be32 co_nfserr; > > + unsigned int co_retries; > > struct knfsd_fh co_fh; > > }; > > > > -- > Jeff Layton <jlayton@xxxxxxxxxx> -- Chuck Lever