Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:

> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
>> This change is very much prone to generate poll_cq errors because of un-cleaned completions which still
>> point to the non-existent QPs. On the new connection when these completions are polled, the poll_cq will
>> fail because old QP pointer is already NULL.
>> Did anyone hit this situation during their testing?

I tested this aggressively with a fault injector that triggers regular connection
disruption.

> Hey Devesh,
> 
> iw_cxgb4 will silently toss CQEs if the QP is not active.

xprtrdma relies on getting a completion (either successful or in error) for every
WR it has posted. The goal of this patch is to avoid throwing away queued
completions after a transport disconnect so we don't lose track of FRMR rkey
updates (FAST_REG_MR and LOCAL_INV completions) and we can capture all RPC
replies posted before the connection was lost.

Sounds like we also need to keep the QP around, even in error state, until all
known WRs on that QP have completed?


> 
> 
>>> -----Original Message-----
>>> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Chuck Lever
>>> Sent: Tuesday, June 24, 2014 4:10 AM
>>> To: linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx
>>> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>>> 
>>> CQs are not destroyed until unmount. By draining CQs on transport
>>> disconnect, successful completions that can change the r.frmr.state field can
>>> be missed.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>> ---
>>>  net/sunrpc/xprtrdma/verbs.c |    5 -----
>>>  1 file changed, 5 deletions(-)
>>> 
>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>>> index 3c7f904..451e100 100644
>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>> @@ -873,9 +873,6 @@ retry:
>>>  			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
>>>  				" status %i\n", __func__, rc);
>>> 
>>> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
>>> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
>>> -
>>>  		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
>>>  		id = rpcrdma_create_id(xprt, ia,
>>>  				(struct sockaddr *)&xprt->rx_data.addr);
>>> @@ -985,8 +982,6 @@ rpcrdma_ep_disconnect(struct rpcrdma_ep *ep,
>>> struct rpcrdma_ia *ia)  {
>>>  	int rc;
>>> 
>>> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
>>> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
>>>  	rc = rdma_disconnect(ia->ri_id);
>>>  	if (!rc) {
>>>  		/* returns without wait if not connected */
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
>>> body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at
>>> http://vger.kernel.org/majordomo-info.html
>> N�����r��y���b�X��ǧv�^�)޺{.n�+����{���"��^n�r���z���h����&���G���h�(�階�ݢj"���m�����z�ޖ���f���h���~�mml==
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux