RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Thursday, July 03, 2014 1:30 AM
> To: Devesh Sharma
> Cc: Steve Wise; linux-rdma@xxxxxxxxxxxxxxx; Linux NFS Mailing List
> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> disconnect
> 
> 
> On Jul 2, 2014, at 3:48 PM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx>
> wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Steve Wise [mailto:swise@xxxxxxxxxxxxxxxxxxxxx]
> >> Sent: Thursday, July 03, 2014 1:16 AM
> >> To: 'Chuck Lever'; Devesh Sharma
> >> Cc: linux-rdma@xxxxxxxxxxxxxxx; 'Linux NFS Mailing List'
> >> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> >> disconnect
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> >>> Sent: Wednesday, July 02, 2014 2:40 PM
> >>> To: Steve Wise; Devesh Sharma
> >>> Cc: linux-rdma@xxxxxxxxxxxxxxx; Linux NFS Mailing List
> >>> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> >>> disconnect
> >>>
> >>>
> >>> On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> >>>>> This change is very likely to generate poll_cq errors because of
> >>>>> un-cleaned completions that still point to the now-nonexistent QPs.
> >>>>> When these completions are polled on the new connection, poll_cq
> >>>>> will fail because the old QP pointer is already NULL.
> >>>>> Did anyone hit this situation during their testing?
> >>>
> >>> I tested this aggressively with a fault injector that triggers
> >>> regular connection disruption.
> >>>
> >>>> Hey Devesh,
> >>>>
> >>>> iw_cxgb4 will silently toss CQEs if the QP is not active.
> >>>
> >>> xprtrdma relies on getting a completion (either successful or in
> >>> error) for every WR it has posted. The goal of this patch is to
> >>> avoid throwing away queued completions after a transport disconnect
> >>> so we don't lose track of FRMR rkey updates (FAST_REG_MR and
> >>> LOCAL_INV completions) and we can capture all RPC replies posted
> >>> before the connection was lost.
> >>>
> >>> Sounds like we also need to keep the QP around, even in error state,
> >>> until all known WRs on that QP have completed?
> >>>
> >
> > Why not poll and process every completion during rpcrdma_cq_cleanup()?
> 
> Yes, I have a patch in the next version of this series that does that.
> It just calls rpcrdma_sendcq_upcall() from the connect worker. I will squash
> that change into this patch.

Cool.

> 
> Maybe it needs to invoke rpcrdma_recvcq_upcall() there as well.

Yes, that would be needed.
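
Something along the lines of the sketch below is what I had in mind (only an
illustration, not Chuck's actual patch; the helper name rpcrdma_flush_cqs and
passing the rpcrdma_ep as the cq_context are my assumptions about how it would
be wired up):

	/* Sketch: on disconnect, process rather than discard whatever is
	 * still queued on the CQs.  The existing upcall handlers poll the
	 * CQ themselves, so calling them directly from the connect worker
	 * should reap every pending send and receive completion.
	 */
	static void
	rpcrdma_flush_cqs(struct rpcrdma_ep *ep)
	{
		rpcrdma_recvcq_upcall(ep->rep_attr.recv_cq, ep);
		rpcrdma_sendcq_upcall(ep->rep_attr.send_cq, ep);
	}

That way no FAST_REG_MR/LOCAL_INV state update or queued RPC reply gets
dropped across the reconnect.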

> 
> 
> >
> >>
> >> Perhaps.
> >>
> >>>
> >>>>
> >>>>
> >>>>>> -----Original Message-----
> >>>>>> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
> >>>>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Chuck Lever
> >>>>>> Sent: Tuesday, June 24, 2014 4:10 AM
> >>>>>> To: linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx
> >>>>>> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> >>>>>> disconnect
> >>>>>>
> >>>>>> CQs are not destroyed until unmount. By draining CQs on transport
> >>>>>> disconnect, successful completions that can change the
> >>>>>> r.frmr.state field can be missed.
> >>>>>>
> >>>>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> net/sunrpc/xprtrdma/verbs.c |    5 -----
> >>>>>> 1 file changed, 5 deletions(-)
> >>>>>>
> >>>>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> >>>>>> index 3c7f904..451e100 100644
> >>>>>> --- a/net/sunrpc/xprtrdma/verbs.c
> >>>>>> +++ b/net/sunrpc/xprtrdma/verbs.c
> >>>>>> @@ -873,9 +873,6 @@ retry:
> >>>>>> 			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
> >>>>>> 				" status %i\n", __func__, rc);
> >>>>>>
> >>>>>> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> >>>>>> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
> >>>>>> -
> >>>>>> 		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
> >>>>>> 		id = rpcrdma_create_id(xprt, ia,
> >>>>>> 				(struct sockaddr *)&xprt->rx_data.addr);
> >>>>>> @@ -985,8 +982,6 @@ rpcrdma_ep_disconnect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
> >>>>>>  {
> >>>>>> 	int rc;
> >>>>>>
> >>>>>> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> >>>>>> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
> >>>>>> 	rc = rdma_disconnect(ia->ri_id);
> >>>>>> 	if (!rc) {
> >>>>>> 		/* returns without wait if not connected */
> >>>>>>
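
[ For anyone skimming the hunks above: rpcrdma_clean_cq() is, as far as I can
tell, just a drain-and-discard loop along the lines of the sketch below (an
approximation, not quoted from the source).  That is why removing these calls
matters -- every completion it reaps is thrown away instead of being handed to
the send/recv completion handlers. ]

	/* Approximate shape of the helper whose calls are removed above:
	 * poll the CQ until it is empty and drop each work completion on
	 * the floor.
	 */
	static void
	rpcrdma_clean_cq(struct ib_cq *cq)
	{
		struct ib_wc wc;

		while (ib_poll_cq(cq, 1, &wc) == 1)
			;	/* completion discarded */
	}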
> >>>>
> >>>
> >>> --
> >>> Chuck Lever
> >>> chuck[dot]lever[at]oracle[dot]com
> >>>
> >>>
> >>
> >
> 
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
> 
> 
