> On Feb 24, 2021, at 3:02 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Wed, Feb 24, 2021 at 02:18:18PM +0000, Chuck Lever wrote:
>>
>>
>>> On Feb 22, 2021, at 6:36 PM, Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> This brings it in line with the regular tcp backchannel, which also has
>>> all those timeouts disabled.
>>>
>>> Prevents the backchannel from timing out, which leaves some async
>>> operations, like server-side copying, stuck indefinitely on the
>>> client side.
>>>
>>> Signed-off-by: Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx>
>>
>> Thanks for your patch! I've included it in the for-rc branch at
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>
> So, I'm sure this patch makes sense.
>
> But I'm also curious why it's not recovering.

Agreed. This patch is not a substitute for proper callback channel
recovery.

> What I think should happen:
>
> - clp->cl_cb_state should be set to NFSD4_CB_DOWN.

I think it's set to FAULT.

> - This should cause the next SEQUENCE reply to have
>   SEQ4_STATUS_CB_PATH_DOWN set.
> - That should poke the client to recover. (Maybe by sending a
>   BIND_CONN_TO_SESSION call?)
>
> I'd be curious whether any of that's actually happening.
>
> --b.
>
>>
>>
>>> ---
>>> Did the same testing with this applied as before, and could not
>>> observe it getting stuck, same as with the previous patch, which I
>>> removed before testing this one.
>>>
>>> This obviously still does not fix the issue of it being seemingly unable
>>> to reestablish the disconnected backchannel.
>>> An event that disconnects the backchannel but leaves the main connection
>>> intact seems a pretty rare occurrence though, outside of this issue.
>>>
>>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> index 63f8be974df2..8186ab6f99f1 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> @@ -252,9 +252,9 @@ xprt_setup_rdma_bc(struct xprt_create *args)
>>> 	xprt->timeout = &xprt_rdma_bc_timeout;
>>> 	xprt_set_bound(xprt);
>>> 	xprt_set_connected(xprt);
>>> -	xprt->bind_timeout = RPCRDMA_BIND_TO;
>>> -	xprt->reestablish_timeout = RPCRDMA_INIT_REEST_TO;
>>> -	xprt->idle_timeout = RPCRDMA_IDLE_DISC_TO;
>>> +	xprt->bind_timeout = 0;
>>> +	xprt->reestablish_timeout = 0;
>>> +	xprt->idle_timeout = 0;
>>>
>>> 	xprt->prot = XPRT_TRANSPORT_BC_RDMA;
>>> 	xprt->ops = &xprt_rdma_bc_procs;
>>> --
>>> 2.25.1
>>>
>>
>> --
>> Chuck Lever

--
Chuck Lever
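
For reference, the DOWN vs. FAULT distinction discussed above matters because the two states would be reported to the client through different SEQUENCE status bits. Below is a rough userspace sketch of that mapping, not the actual nfsd code. The flag values are the ones RFC 5661 defines; the enum and both helper names are hypothetical, and what recovery step the client actually takes (for example BIND_CONN_TO_SESSION) is deliberately not modeled.

```c
#include <stdint.h>

/*
 * Callback channel states, named after the nfsd states mentioned in
 * the thread. The enum itself is illustrative only.
 */
enum cb_state {
	CB_UP,
	CB_UNKNOWN,
	CB_DOWN,
	CB_FAULT,
};

/* SEQUENCE status flag bits as defined in RFC 5661. */
#define SEQ4_STATUS_CB_PATH_DOWN	0x00000001u
#define SEQ4_STATUS_BACKCHANNEL_FAULT	0x00000400u

/*
 * Hypothetical helper: the status flags a server would report in its
 * next SEQUENCE reply for a given callback channel state.
 */
uint32_t seq_status_flags(enum cb_state state)
{
	switch (state) {
	case CB_DOWN:
		return SEQ4_STATUS_CB_PATH_DOWN;
	case CB_FAULT:
		return SEQ4_STATUS_BACKCHANNEL_FAULT;
	default:
		return 0;
	}
}

/*
 * Hypothetical helper: nonzero if the flags tell the client that the
 * backchannel needs some form of recovery.
 */
int client_needs_cb_recovery(uint32_t flags)
{
	return (flags & (SEQ4_STATUS_CB_PATH_DOWN |
			 SEQ4_STATUS_BACKCHANNEL_FAULT)) != 0;
}
```

If the state really ends up as FAULT rather than DOWN, a client under this model would see the BACKCHANNEL_FAULT bit and never SEQ4_STATUS_CB_PATH_DOWN, which may be worth checking in a capture.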