Re: [PATCH v1] SUNRPC: Ensure backchannel transports are marked connected

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Aug 19, 2021, at 9:01 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Thu, 2021-08-19 at 08:29 -0400, Chuck Lever wrote:
>> With NFSv4.1+ on RDMA, backchannel recovery appears not to work.
>> 
>> xprt_setup_xxx_bc() is invoked by the client's first CREATE_SESSION
>> operation, and it always marks the rpc_clnt's transport as
>> connected.
>> 
>> On a subsequent CREATE_SESSION, if rpc_create() is called and
>> xpt_bc_xprt is populated, it might not be connected (for instance,
>> if a backchannel fault has occurred). Ensure that code path returns
>> a connected xprt also.
>> 
>> Reported-by: Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx>
>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>> ---
>>  net/sunrpc/clnt.c |    1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>> index 8b4de70e8ead..570480a649a3 100644
>> --- a/net/sunrpc/clnt.c
>> +++ b/net/sunrpc/clnt.c
>> @@ -535,6 +535,7 @@ struct rpc_clnt *rpc_create(struct
>> rpc_create_args *args)
>>                 xprt = args->bc_xprt->xpt_bc_xprt;
>>                 if (xprt) {
>>                         xprt_get(xprt);
>> +                       xprt_set_connected(xprt);
>>                         return rpc_create_xprt(args, xprt);
>>                 }
>>         }
>> 
>> 
> 
> No. This is wrong. If the connection got disconnected, then the client
> needs to reconnect and build a new connection altogether. We can't just
> make pretend that the old connection still exists.

The patch description is not clear: the client has not disconnected.
The forward channel is functioning properly, and the server has set
SEQ4_STATUS_BACKCHANNEL_FAULT.

To recover, the client sends a DESTROY_SESSION / CREATE_SESSION pair
on the existing connection. On the server, setup_callback_client()
invokes rpc_create() again -- it's this step that is failing during
the second CREATE_SESSION on a connection because the old xprt
is returned but it's still marked disconnected.

An alternative would be to ensure that setup_callback_client()
always puts xpt_bc_xprt before it invokes rpc_create(). But it
looked to me like rpc_create() already has a bunch of logic to
deal with an existing xpt_bc_xprt.


--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux