RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect

> -----Original Message-----
> From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
> Sent: Thursday, July 03, 2014 1:27 AM
> To: Devesh Sharma; 'Chuck Lever'; linux-rdma@xxxxxxxxxxxxxxx; linux-
> nfs@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> disconnect
> 
> 
> 
> > -----Original Message-----
> > From: Devesh Sharma [mailto:Devesh.Sharma@xxxxxxxxxx]
> > Sent: Wednesday, July 02, 2014 2:54 PM
> > To: Steve Wise; 'Chuck Lever'; linux-rdma@xxxxxxxxxxxxxxx;
> > linux-nfs@xxxxxxxxxxxxxxx
> > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > disconnect
> >
> >
> >
> > > -----Original Message-----
> > > From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
> > > Sent: Thursday, July 03, 2014 1:21 AM
> > > To: Devesh Sharma; 'Chuck Lever'; linux-rdma@xxxxxxxxxxxxxxx; linux-
> > > nfs@xxxxxxxxxxxxxxx
> > > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > disconnect
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Devesh Sharma [mailto:Devesh.Sharma@xxxxxxxxxx]
> > > > Sent: Wednesday, July 02, 2014 2:43 PM
> > > > To: Steve Wise; Chuck Lever; linux-rdma@xxxxxxxxxxxxxxx;
> > > > linux-nfs@xxxxxxxxxxxxxxx
> > > > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on
> > > > transport disconnect
> > > >
> > > > > -----Original Message-----
> > > > > From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
> > > > > Sent: Thursday, July 03, 2014 12:59 AM
> > > > > To: Devesh Sharma; Chuck Lever; linux-rdma@xxxxxxxxxxxxxxx;
> > > > > linux- nfs@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on
> > > > > transport disconnect
> > > > >
> > > > > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > > > > > > This change is very prone to generating poll_cq errors
> > > > > > > because of uncleaned completions that still point to
> > > > > > > nonexistent QPs. When these completions are polled on the new
> > > > > > > connection, poll_cq will fail because the old QP pointer is
> > > > > > > already NULL.
> > > > > > > Did anyone hit this situation during their testing?
> > > > >
> > > > > Hey Devesh,
> > > > >
> > > > > iw_cxgb4 will silently toss CQEs if the QP is not active.
> > > >
> > > > Yes, I just checked that in the mlx and cxgb4 driver code. ocrdma, on
> > > > the other hand, asserts a BUG_ON for such CQEs, causing a system panic.
> > > > Out of curiosity: how is this change useful here? Does it reduce the
> > > > re-connection time? In any case, rpcrdma_clean_cq was discarding the
> > > > completions (both flushed and successful).
> > > >
> > >
> > > Well, I don't think there is anything restricting an application from
> > > destroying the QP with pending CQEs on its CQs. So it definitely
> > > shouldn't cause a BUG_ON(), I think. I'll have to read up in the Verbs
> > > specs if destroying a QP kills all the pending CQEs...
> >
> > Sorry for the confusion, let me clarify: in ocrdma, the BUG_ON is hit in
> > poll_cq() after the re-connection happens and the CQ is polled again.
> > At that point the first completion in the CQ still points to the old
> > QP ID, for which ocrdma no longer has a valid QP pointer.
> >
> 
> Right.  Which means it’s a stale CQE.  I don't think that should cause a
> BUG_ON.

Yes, this surely needs a fix in ocrdma.
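
To make the idea concrete, here is a minimal userspace sketch of a poll routine that drops stale CQEs instead of asserting on them. It is not the ocrdma code: every name in it (sketch_dev, sketch_cq, sketch_poll_cq, MAX_QPS) is hypothetical, and the flat array ring and QP table are simplifying assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_QPS 8	/* hypothetical size of the QP lookup table */

struct sketch_qp { uint32_t id; };

/* A completion entry that names its QP by ID (assumed layout). */
struct sketch_cqe { uint32_t qp_id; int status; };

/* Device-wide table mapping QP ID to QP; NULL means destroyed. */
struct sketch_dev { struct sketch_qp *qp_tbl[MAX_QPS]; };

struct sketch_cq {
	struct sketch_cqe *ring;	/* pending completions */
	size_t head;			/* next entry to consume */
	size_t count;			/* number of valid entries in ring */
	size_t stale_dropped;		/* CQEs discarded for dead QPs */
};

/*
 * Copy up to num_entries completions into out[]. A CQE whose QP can no
 * longer be looked up is stale: it is consumed and counted, but never
 * reported to the caller and never treated as a fatal error.
 */
static int sketch_poll_cq(struct sketch_dev *dev, struct sketch_cq *cq,
			  struct sketch_cqe *out, int num_entries)
{
	int polled = 0;

	while (cq->head < cq->count && polled < num_entries) {
		struct sketch_cqe *cqe = &cq->ring[cq->head++];
		struct sketch_qp *qp = NULL;

		if (cqe->qp_id < MAX_QPS)
			qp = dev->qp_tbl[cqe->qp_id];

		if (qp == NULL) {
			/* Stale CQE from a destroyed QP: skip it quietly. */
			cq->stale_dropped++;
			continue;
		}
		out[polled++] = *cqe;
	}
	return polled;
}
```

With this shape, a CQ that outlives its QP across a reconnect only costs a few skipped entries on the next poll, which matches the behaviour Steve describes for iw_cxgb4.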

> 
> 
> > >
> > >
> > > > >
> > > > >
> > > > > >> -----Original Message-----
> > > > > >> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
> > > > > >> owner@xxxxxxxxxxxxxxx] On Behalf Of Chuck Lever
> > > > > >> Sent: Tuesday, June 24, 2014 4:10 AM
> > > > > >> To: linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx
> > > > > >> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on
> > > > > >> transport disconnect
> > > > > >>
> > > > > >> CQs are not destroyed until unmount. By draining CQs on
> > > > > >> transport disconnect, successful completions that can change
> > > > > >> the r.frmr.state field can be missed.
> > > >
> > > > Aren't those still missed, though? Those successful completions
> > > > will still be dropped after re-connection. Am I missing something
> > > > in understanding the motivation?
> > > >
> > > > > >>
> > > > > >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > >> ---
> > > > > >>   net/sunrpc/xprtrdma/verbs.c |    5 -----
> > > > > >>   1 file changed, 5 deletions(-)
> > > > > >>
> > > > > > >> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> > > > > > >> index 3c7f904..451e100 100644
> > > > > >> --- a/net/sunrpc/xprtrdma/verbs.c
> > > > > >> +++ b/net/sunrpc/xprtrdma/verbs.c
> > > > > >> @@ -873,9 +873,6 @@ retry:
> > > > > > >>   			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
> > > > > > >>   				" status %i\n", __func__, rc);
> > > > > >>
> > > > > >> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> > > > > >> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
> > > > > >> -
> > > > > >>   		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
> > > > > >>   		id = rpcrdma_create_id(xprt, ia,
> > > > > > >>   				(struct sockaddr *)&xprt->rx_data.addr);
> > > > > > >> @@ -985,8 +982,6 @@ rpcrdma_ep_disconnect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
> > > > > > >>  {
> > > > > >>   	int rc;
> > > > > >>
> > > > > >> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> > > > > >> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
> > > > > >>   	rc = rdma_disconnect(ia->ri_id);
> > > > > >>   	if (!rc) {
> > > > > >>   		/* returns without wait if not connected */
> > > > > >>
> > > > > >> --
> > > > > >> To unsubscribe from this list: send the line "unsubscribe linux-
> rdma"
> > > > > >> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
> > > > > majordomo
> > > > > >> info at http://vger.kernel.org/majordomo-info.html
> > >




