Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jul 2, 2014, at 3:48 PM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx> wrote:

> 
> 
>> -----Original Message-----
>> From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
>> Sent: Thursday, July 03, 2014 1:16 AM
>> To: 'Chuck Lever'; Devesh Sharma
>> Cc: linux-rdma@xxxxxxxxxxxxxxx; 'Linux NFS Mailing List'
>> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
>> disconnect
>> 
>> 
>> 
>>> -----Original Message-----
>>> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
>>> Sent: Wednesday, July 02, 2014 2:40 PM
>>> To: Steve Wise; Devesh Sharma
>>> Cc: linux-rdma@xxxxxxxxxxxxxxx; Linux NFS Mailing List
>>> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
>>> disconnect
>>> 
>>> 
>>> On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx>
>> wrote:
>>> 
>>>> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
>>>>> This change is very much prone to generate poll_cq errors because
>>>>> of un-cleaned
>>> completions which still
>>>>> point to the non-existent QPs. On the new connection when these
>>>>> completions are polled,
>>> the poll_cq will
>>>>> fail because old QP pointer is already NULL.
>>>>> Did anyone hit this situation during their testing?
>>> 
>>> I tested this aggressively with a fault injector that triggers regular
>>> connection disruption.
>>> 
>>>> Hey Devesh,
>>>> 
>>>> iw_cxgb4 will silently toss CQEs if the QP is not active.
>>> 
>>> xprtrdma relies on getting a completion (either successful or in
>>> error) for every WR it has posted. The goal of this patch is to avoid
>>> throwing away queued completions after a transport disconnect so we
>>> don't lose track of FRMR rkey updates (FAST_REG_MR and LOCAL_INV
>>> completions) and we can capture all RPC replies posted before the
>> connection was lost.
>>> 
>>> Sounds like we also need to keep the QP around, even in error state,
>>> until all known WRs on that QP have completed?
>>> 
> 
> Why not poll and process every completion during rpcrdma_cq_cleanup()….

Yes, I have a patch in the next version of this series that does that.
It just calls rpcrdma_sendcq_upcall() from the connect worker. I will
squash that change into this patch.

Maybe it needs to invoke rpcrdma_recvcq_upcall() there as well.


> 
>> 
>> Perhaps.
>> 
>>> 
>>>> 
>>>> 
>>>>>> -----Original Message-----
>>>>>> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
>>>>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Chuck Lever
>>>>>> Sent: Tuesday, June 24, 2014 4:10 AM
>>>>>> To: linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx
>>>>>> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
>>>>>> disconnect
>>>>>> 
>>>>>> CQs are not destroyed until unmount. By draining CQs on transport
>>>>>> disconnect, successful completions that can change the
>>>>>> r.frmr.state field can be missed.
>>>>>> 
>>>>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>>> ---
>>>>>> net/sunrpc/xprtrdma/verbs.c |    5 -----
>>>>>> 1 file changed, 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/net/sunrpc/xprtrdma/verbs.c
>>>>>> b/net/sunrpc/xprtrdma/verbs.c index 3c7f904..451e100 100644
>>>>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>>>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>>>>> @@ -873,9 +873,6 @@ retry:
>>>>>> 			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
>>>>>> 				" status %i\n", __func__, rc);
>>>>>> 
>>>>>> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
>>>>>> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
>>>>>> -
>>>>>> 		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
>>>>>> 		id = rpcrdma_create_id(xprt, ia,
>>>>>> 				(struct sockaddr *)&xprt->rx_data.addr);
>> @@ -985,8 +982,6 @@
>>>>>> rpcrdma_ep_disconnect(struct rpcrdma_ep *ep, struct rpcrdma_ia
>>>>>> *ia)  {
>>>>>> 	int rc;
>>>>>> 
>>>>>> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
>>>>>> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
>>>>>> 	rc = rdma_disconnect(ia->ri_id);
>>>>>> 	if (!rc) {
>>>>>> 		/* returns without wait if not connected */
>>>>>> 
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>> linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> N     r  y   b X  ǧv ^ )޺{.n +    {   "  ^n r   z   h    &   G
>>>>> h ( 階
>>> ݢj"   m     z ޖ   f   h   ~ mml==
>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
>> majordomo
>>>> info at  http://vger.kernel.org/majordomo-info.html
>>> 
>>> --
>>> Chuck Lever
>>> chuck[dot]lever[at]oracle[dot]com
>>> 
>>> 
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux