Re: [PATCH v1] SUNRPC: Ensure backchannel transports are marked connected

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2021-08-19 at 13:16 +0000, Chuck Lever III wrote:
> 
> 
> > On Aug 19, 2021, at 9:01 AM, Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Thu, 2021-08-19 at 08:29 -0400, Chuck Lever wrote:
> > > With NFSv4.1+ on RDMA, backchannel recovery appears not to work.
> > > 
> > > xprt_setup_xxx_bc() is invoked by the client's first
> > > CREATE_SESSION
> > > operation, and it always marks the rpc_clnt's transport as
> > > connected.
> > > 
> > > On a subsequent CREATE_SESSION, if rpc_create() is called and
> > > xpt_bc_xprt is populated, it might not be connected (for
> > > instance,
> > > if a backchannel fault has occurred). Ensure that code path
> > > returns
> > > a connected xprt also.
> > > 
> > > Reported-by: Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx>
> > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > ---
> > >  net/sunrpc/clnt.c |    1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> > > index 8b4de70e8ead..570480a649a3 100644
> > > --- a/net/sunrpc/clnt.c
> > > +++ b/net/sunrpc/clnt.c
> > > @@ -535,6 +535,7 @@ struct rpc_clnt *rpc_create(struct
> > > rpc_create_args *args)
> > >                 xprt = args->bc_xprt->xpt_bc_xprt;
> > >                 if (xprt) {
> > >                         xprt_get(xprt);
> > > +                       xprt_set_connected(xprt);
> > >                         return rpc_create_xprt(args, xprt);
> > >                 }
> > >         }
> > > 
> > > 
> > 
> > No. This is wrong. If the connection got disconnected, then the
> > client
> > needs to reconnect and build a new connection altogether. We can't
> > just
> > make pretend that the old connection still exists.
> 
> The patch description is not clear: the client has not disconnected.
> The forward channel is functioning properly, and the server has set
> SEQ4_STATUS_BACKCHANNEL_FAULT.
> 
> To recover, the client sends a DESTROY_SESSION / CREATE_SESSION pair
> on the existing connection. On the server, setup_callback_client()
> invokes rpc_create() again -- it's this step that is failing during
> the second CREATE_SESSION on a connection because the old xprt
> is returned but it's still marked disconnected.
> 

How is that happening? As far as I can tell, the only thing that can
cause the backchannel to be marked as closed is a call to
svc_delete_xprt(), which explicitly calls xprt->xpt_bc_xprt->ops-
>close(xprt->xpt_bc_xprt).

The server shouldn't be reusing anything from that svc_xprt after that.

> An alternative would be to ensure that setup_callback_client()
> always puts xpt_bc_xprt before it invokes rpc_create(). But it
> looked to me like rpc_create() already has a bunch of logic to
> deal with an existing xpt_bc_xprt.
> 
> 

The connection status is managed by the transport layer, not the RPC
client layer.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux