On Wed, Feb 24, 2021 at 08:03:54PM +0000, Chuck Lever wrote: > > > > On Feb 24, 2021, at 3:02 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > > > > On Wed, Feb 24, 2021 at 02:18:18PM +0000, Chuck Lever wrote: > >> > >> > >>> On Feb 22, 2021, at 6:36 PM, Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx> wrote: > >>> > >>> This brings it in line with the regular tcp backchannel, which also has > >>> all those timeouts disabled. > >>> > >>> Prevents the backchannel from timing out, getting some async operations > >>> like server side copying getting stuck indefinitely on the client side. > >>> > >>> Signed-off-by: Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx> > >> > >> Thanks for your patch! I've included it in the for-rc branch at > >> > >> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git > > > > So, I'm sure this patch makes sense. > > > > But I'm also curious why it's not recovering. > > Agreed. This patch is not a substitute for proper callback channel recovery. > > > > What I think should happen: > > > > - clp->cl_cb_state should be set to NFSD4_CB_DOWN. > > I think it's set to FAULT. OK. The result should be similar in that case, but SEQUENCE gets the SEQ4_STATUS_BACKCHANNEL_FAULT flag set instead. --b. > > > > - This should cause the next SEQUENCE reply to have > > SEQ4_STATUS_CB_PATH_DOWN set. > > - That should poke the client to recover. (Maybe by sending a > > BIND_CONN_TO_SESSION call?) > > > > I'd be curious whether any of that's actually happening. > > > > --b. > > > >> > >> > >>> --- > >>> Did the same testing with this applied than before, and could not > >>> observe it getting stuck, same as with the previous patch, which I > >>> removed before testing this one. > >>> > >>> This obviously still does not fix the issue of it being seemingly unable > >>> to reestablish the disconnected backchannel. > >>> An event that disconnects the backchannel but leaves the main connection > >>> intact seems a pretty rare occurance though, outside of this issue. > >>> > >>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 6 +++--- > >>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > >>> index 63f8be974df2..8186ab6f99f1 100644 > >>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > >>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > >>> @@ -252,9 +252,9 @@ xprt_setup_rdma_bc(struct xprt_create *args) > >>> xprt->timeout = &xprt_rdma_bc_timeout; > >>> xprt_set_bound(xprt); > >>> xprt_set_connected(xprt); > >>> - xprt->bind_timeout = RPCRDMA_BIND_TO; > >>> - xprt->reestablish_timeout = RPCRDMA_INIT_REEST_TO; > >>> - xprt->idle_timeout = RPCRDMA_IDLE_DISC_TO; > >>> + xprt->bind_timeout = 0; > >>> + xprt->reestablish_timeout = 0; > >>> + xprt->idle_timeout = 0; > >>> > >>> xprt->prot = XPRT_TRANSPORT_BC_RDMA; > >>> xprt->ops = &xprt_rdma_bc_procs; > >>> -- > >>> 2.25.1 > >>> > >> > >> -- > >> Chuck Lever > > -- > Chuck Lever > >