Re: [PATCH] xprtrdma: Wake up re_connect_wait on disconnect

Hi Dan-

> On Jun 20, 2020, at 2:46 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> 
> Hi Dan-
> 
>> On Jun 20, 2020, at 1:18 PM, Dan Aloni <dan@xxxxxxxxxxxx> wrote:
>> 
>> Given that rpcrdma_xprt_connect() happens from workqueue context, in cases where
>> connections don't succeed, something needs to wake it up. In my case, this has
>> been observed when the CM callback received `RDMA_CM_EVENT_REJECTED`, and
>> `rpcrdma_xprt_connect()` slept forever.
> 
> Interesting. My development and testing generates plenty of REJECTED connection
> requests, but I never saw this particular failure mode.

Correction: My testing _used_ _to_ generate REJECTED events regularly. It does
not seem to any more, even after client crashes. So that explains why I haven't
seen this before.

I haven't reproduced the problem here, but the fix still looks proper to me,
and doesn't appear to introduce any regressions. I do have some issues with your
proposed patch, though.

The first paragraph of the patch description is incorrect. RDMA_CM_EVENT_DISCONNECTED
can occur only once a connection has been established. That guarantees there are no
waiters on re_connect_wait in that case. It's connect errors that need to wake up
the connect worker.
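
To illustrate the point, here is a minimal userspace sketch of the pattern, not
the kernel code. It assumes a pthread condition variable standing in for
re_connect_wait and a status field that is 0 while the attempt is pending and a
negative errno on failure. All of the demo_* names are hypothetical. The waiter
returns only because the handler broadcasts after recording the error; without
the broadcast it would sleep forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for the endpoint; not struct rpcrdma_ep. */
struct demo_ep {
	pthread_mutex_t lock;
	pthread_cond_t connect_wait;	/* plays the role of re_connect_wait */
	int connect_status;		/* assumed: 0 = pending, <0 = errno */
};

static void demo_ep_init(struct demo_ep *ep)
{
	pthread_mutex_init(&ep->lock, NULL);
	pthread_cond_init(&ep->connect_wait, NULL);
	ep->connect_status = 0;
}

/* Event handler side: record the outcome, then wake every waiter. */
static void demo_cm_event(struct demo_ep *ep, int status)
{
	pthread_mutex_lock(&ep->lock);
	ep->connect_status = status;
	pthread_cond_broadcast(&ep->connect_wait);
	pthread_mutex_unlock(&ep->lock);
}

/* Connect worker side: sleep until the status is no longer "pending". */
static int demo_connect_wait(struct demo_ep *ep)
{
	int status;

	pthread_mutex_lock(&ep->lock);
	while (ep->connect_status == 0)
		pthread_cond_wait(&ep->connect_wait, &ep->lock);
	status = ep->connect_status;
	pthread_mutex_unlock(&ep->lock);
	return status;
}

/* Simulates the CM callback delivering a rejected connection request. */
static void *demo_reject_thread(void *arg)
{
	demo_cm_event(arg, -ECONNREFUSED);
	return NULL;
}

/* Runs one rejected connection attempt; returns what the waiter saw. */
static int demo_rejected_connect(void)
{
	struct demo_ep ep;
	pthread_t cm;
	int rc, status;

	demo_ep_init(&ep);
	rc = pthread_create(&cm, NULL, demo_reject_thread, &ep);
	assert(rc == 0);	/* demo only: no error recovery */
	(void)rc;
	status = demo_connect_wait(&ep);
	pthread_join(cm, NULL);
	return status;
}
```

Calling demo_rejected_connect() returns -ECONNREFUSED once the simulated
handler has run. Removing the pthread_cond_broadcast() call from
demo_cm_event() leaves the waiter asleep if it was already waiting when the
status changed, which is the userspace analogue of a sleeper that nothing
wakes.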


>> This continues the fix in commit 58bd6656f808 ('xprtrdma: Restore wake-up-all to
>> rpcrdma_cm_event_handler()').

IMO this paragraph needs to be replaced by:

Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")


>> Signed-off-by: Dan Aloni <dan@xxxxxxxxxxxx>
>> CC: Chuck Lever <chuck.lever@xxxxxxxxxx>
>> ---
>> 
>> Notes:
>>   Hi Chuck,
>> 
>>   Maybe I missed something, as it is not clear to me how otherwise (without this
>>   patch), re_connect_wait can be woken up in this situation. Please explain?
>> 
>> net/sunrpc/xprtrdma/verbs.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>> index 2ae348377806..8bd76a47a91f 100644
>> --- a/net/sunrpc/xprtrdma/verbs.c
>> +++ b/net/sunrpc/xprtrdma/verbs.c
>> @@ -289,6 +289,7 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
>> 		ep->re_connect_status = -ECONNABORTED;
>> disconnected:
>> 		xprt_force_disconnect(xprt);
>> +		wake_up_all(&ep->re_connect_wait);
>> 		return rpcrdma_ep_destroy(ep);
>> 	default:
>> 		break;

This hunk does not apply on top of fixes I've already sent to Anna for 5.8-rc1.

So, if you don't object, I'll adjust your patch (this hunk and the description)
before sending it along to Anna.


--
Chuck Lever
