This is a note to let you know that I've just added the patch titled xprtrdma: Fix rpcrdma_reqs_reset() to the 5.10-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: xprtrdma-fix-rpcrdma_reqs_reset.patch and it can be found in the queue-5.10 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 864803acfc4b41c46c69540a53b1ca0a8dd4b3ac Author: Chuck Lever <chuck.lever@xxxxxxxxxx> Date: Tue Jun 4 15:45:23 2024 -0400 xprtrdma: Fix rpcrdma_reqs_reset() [ Upstream commit acd9f2dd23c632568156217aac7a05f5a0313152 ] Avoid FastReg operations getting MW_BIND_ERR after a reconnect. rpcrdma_reqs_reset() is called on transport tear-down to get each rpcrdma_req back into a clean state. MRs on req->rl_registered are waiting for a FastReg, are already registered, or are waiting for invalidation. If the transport is being torn down when reqs_reset() is called, the matching LocalInv might never be posted. That leaves these MR registered /and/ on req->rl_free_mrs, where they can be re-used for the next connection. Since xprtrdma does not keep specific track of the MR state, it's not possible to know what state these MRs are in, so the only safe thing to do is release them immediately. Fixes: 5de55ce951a1 ("xprtrdma: Release in-flight MRs on disconnect") Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> Signed-off-by: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c index 177099f6ae182..2035d748d923c 100644 --- a/net/sunrpc/xprtrdma/frwr_ops.c +++ b/net/sunrpc/xprtrdma/frwr_ops.c @@ -86,7 +86,8 @@ static void frwr_mr_recycle(struct rpcrdma_mr *mr) frwr_mr_release(mr); } -/* frwr_reset - Place MRs back on the free list +/** + * frwr_reset - Place MRs back on @req's free list * @req: request to reset * * Used after a failed marshal. For FRWR, this means the MRs diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c index 167f0d99b8580..9262c94a13c1d 100644 --- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -932,6 +932,8 @@ static int rpcrdma_reqs_setup(struct rpcrdma_xprt *r_xprt) static void rpcrdma_req_reset(struct rpcrdma_req *req) { + struct rpcrdma_mr *mr; + /* Credits are valid for only one connection */ req->rl_slot.rq_cong = 0; @@ -941,7 +943,19 @@ static void rpcrdma_req_reset(struct rpcrdma_req *req) rpcrdma_regbuf_dma_unmap(req->rl_sendbuf); rpcrdma_regbuf_dma_unmap(req->rl_recvbuf); - frwr_reset(req); + /* The verbs consumer can't know the state of an MR on the + * req->rl_registered list unless a successful completion + * has occurred, so they cannot be re-used. + */ + while ((mr = rpcrdma_mr_pop(&req->rl_registered))) { + struct rpcrdma_buffer *buf = &mr->mr_xprt->rx_buf; + + spin_lock(&buf->rb_lock); + list_del(&mr->mr_all); + spin_unlock(&buf->rb_lock); + + frwr_mr_release(mr); + } } /* ASSUMPTION: the rb_allreqs list is stable for the duration,