On Tue, 2023-05-09 at 09:41 +1000, NeilBrown wrote: > When an RPC request is deferred, the rq_xprt_ctxt pointer is moved out > of the svc_rqst into the svc_deferred_req. > When the deferred request is revisited, the pointer is copied into > the new svc_rqst - and also remains in the svc_deferred_req. > > In the (rare?) case that the request is deferred a second time, the old > svc_deferred_req is reused - it still has all the correct content. > However in that case the rq_xprt_ctxt pointer is NOT cleared so that > when xpo_release_xprt is called, the ctxt is freed (UDP) or possible > added to a free list (RDMA). > When the deferred request is revisited for a second time, it will > reference this ctxt which may be invalid, and the free the object a > second time which is likely to oops. > I've always found the deferral code to be really hard to follow. The dearth of comments around the design doesn't help either... To be clear, when we call into svc_defer, if rq_deferred is already set, then that means that we're revisiting a request that has already been deferred at least once? > So change svc_defer() to *always* clear rq_xprt_ctxt, and assert that > the value is now stored in the svc_deferred_req. > > Fixes: 773f91b2cf3f ("SUNRPC: Fix NFSD's request deferral on RDMA transports") > Signed-off-by: NeilBrown <neilb@xxxxxxx> > --- > net/sunrpc/svc_xprt.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c > index 84e5d7d31481..5fd94f6bdc75 100644 > --- a/net/sunrpc/svc_xprt.c > +++ b/net/sunrpc/svc_xprt.c > @@ -1223,13 +1223,14 @@ static struct cache_deferred_req *svc_defer(struct cache_req *req) > dr->daddr = rqstp->rq_daddr; > dr->argslen = rqstp->rq_arg.len >> 2; > dr->xprt_ctxt = rqstp->rq_xprt_ctxt; > - rqstp->rq_xprt_ctxt = NULL; > > /* back up head to the start of the buffer and copy */ > skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len; > memcpy(dr->args, rqstp->rq_arg.head[0].iov_base - skip, > dr->argslen << 2); > } > + WARN_ON_ONCE(rqstp->rq_xprt_ctxt != dr->xprt_ctxt); > + rqstp->rq_xprt_ctxt = NULL; > trace_svc_defer(rqstp); > svc_xprt_get(rqstp->rq_xprt); > dr->xprt = rqstp->rq_xprt; I think this looks right, assuming my understanding of what rq_deferred == NULL indicates. Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>