RE: [PATCH V1] NFS-RDMA: fix qp pointer validation checks

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Thursday, April 10, 2014 5:56 AM
> To: Devesh Sharma
> Cc: Linux NFS Mailing List; linux-rdma@xxxxxxxxxxxxxxx; Trond Myklebust
> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
> 
> 
> On Apr 9, 2014, at 7:56 PM, Devesh Sharma <Devesh.Sharma@xxxxxxxxxx>
> wrote:
> 
> > Hi Chuck and Trond
> >
> > I will resend a v2 for this.
> > What if ib_post_send() fails with an immediate error? In that case
> > DECR_CQCOUNT() will also be called, but no completion will be reported.
> > Will that not cause any problems?
> 
> We should investigate whether an error return from ib_post_{send,recv}
> means there will be no completion. But I've never seen these verbs fail in
> practice, so I'm not in a hurry to make work for anyone! ;-)
Any verb can fail, perhaps because the system is under memory pressure.
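To make the accounting question concrete, here is a minimal userspace sketch, not kernel code. All names in it (model_ep, model_post_send, model_post_accounted) are hypothetical, and it assumes that a post which returns an error immediately never generates a completion, which is exactly the point that still needs to be confirmed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the endpoint; cqcount models ep->rep_cqcount. */
struct model_ep {
	int cqcount;
};

/* Hypothetical post routine: fails immediately when fail_now is set. */
static int model_post_send(int fail_now)
{
	return fail_now ? -ENOMEM : 0;
}

/*
 * Decrement the counter, post, and put the count back if the post
 * failed right away. The restore step is only valid under the
 * assumption that a failed post produces no completion.
 */
static int model_post_accounted(struct model_ep *ep, int fail_now)
{
	int rc;

	assert(ep != NULL);
	ep->cqcount--;			/* models DECR_CQCOUNT(ep) */
	rc = model_post_send(fail_now);
	if (rc)
		ep->cqcount++;		/* assumption: no completion will arrive */
	return rc;
}
```

If it turns out that a failed post can still produce a completion, the restore step above would be wrong and should be dropped.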
> 
> However it seems to me the new (!ia->ri_id->qp) checks outside the connect
> logic are unnecessary.
> 
> Clearly, as you noticed, the ib_post_{send,recv} verbs do not check that their
> "qp" argument is NULL before dereferencing it.
> 
> But I don't understand how xprtrdma can post any operation if the transport
> isn't connected. In other words, how would it be possible to call
> rpcrdma_ep_post_recv() if the connect had failed and there was no QP?
> 
> If disconnect wipes ia->ri_id->qp while there are still operations in progress,
> that would be the real bug.
Yes! But I have seen one more kernel oops where the QP is destroyed and xprtrdma still tries to post a LOCAL_INV
WR on a NULL QP pointer, and hence the system crashes. So I think what you mentioned is really happening.
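For v2, the ordering I have in mind looks roughly like the userspace sketch below. It is only a model under stated assumptions: model_qp, model_ep, model_id, model_decr_cqcount, model_post and model_ep_post are hypothetical names, not the kernel structures or verbs. The point is that the QP check comes before the counter is touched, so an early return leaves the count unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_qp { int posted; };		/* counts accepted posts */
struct model_ep { int cqcount; };		/* models ep->rep_cqcount */
struct model_id { struct model_qp *qp; };	/* models ia->ri_id */

/* Models DECR_CQCOUNT(): decrement and return the new value. */
static int model_decr_cqcount(struct model_ep *ep)
{
	return --ep->cqcount;
}

/* Models a post call that always succeeds. */
static int model_post(struct model_qp *qp)
{
	qp->posted++;
	return 0;
}

/* Check the QP first, so a bail-out never disturbs the counter. */
static int model_ep_post(struct model_id *id, struct model_ep *ep)
{
	assert(id != NULL && ep != NULL);
	if (id->qp == NULL)
		return -EINVAL;	/* nothing decremented, nothing posted */

	model_decr_cqcount(ep);
	return model_post(id->qp);
}
```

This check alone would not close a race where disconnect destroys the QP between the test and the post, so it does not replace finding out why operations are still in flight at that point.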
> 
> 
> > Also in rpcrdma_register_frmr_external() I am seeing DECR_CQCOUNT is
> > called twice: first at line 1538 (unlikely, however) and second at line 1562.
> > Shouldn't it be only at 1562?
> 
> if (seg1->mr_chunk.rl_mw->r.frmr.state == FRMR_IS_VALID) then
> rpcrdma_register_frmr_external() posts two Work Requests (LOCAL_INV
> then FAST_REG_MR) with one ib_post_send(). Thus it is correct to
> DECR_CQCOUNT twice in that case because each WR will trigger a separate
> completion event.
Oh! I missed that.
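Writing it down for my own understanding, as a userspace sketch rather than the real code: model_wr, model_build_chain and model_chain_length are hypothetical names, and only the chain link of a work request is modeled. One post call submits every WR on the chain, so the number of decrements should match the chain length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a send work request: only the link is modeled. */
struct model_wr {
	struct model_wr *next;
	int opcode;
};

enum { MODEL_WR_LOCAL_INV = 1, MODEL_WR_FAST_REG_MR = 2 };

/* Count the work requests that a single post call would submit. */
static int model_chain_length(const struct model_wr *wr)
{
	int n = 0;

	for (; wr != NULL; wr = wr->next)
		n++;
	return n;
}

/*
 * Build the chain as described above: FAST_REG_MR alone, or LOCAL_INV
 * followed by FAST_REG_MR when the FRMR is still valid.
 * Returns the head of the chain to post.
 */
static struct model_wr *model_build_chain(struct model_wr *inv,
					  struct model_wr *reg,
					  int frmr_is_valid)
{
	assert(inv != NULL && reg != NULL);
	reg->opcode = MODEL_WR_FAST_REG_MR;
	reg->next = NULL;
	if (!frmr_is_valid)
		return reg;
	inv->opcode = MODEL_WR_LOCAL_INV;
	inv->next = reg;
	return inv;
}
```

With a valid FRMR the chain has two entries, which matches the two DECR_CQCOUNT calls; otherwise it has one.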
> 
> 
> > -----Original Message-----
> > From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> > Sent: Thursday, April 10, 2014 1:57 AM
> > To: Devesh Sharma
> > Cc: Linux NFS Mailing List; linux-rdma@xxxxxxxxxxxxxxx; Trond
> > Myklebust
> > Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
> >
> >
> > On Apr 9, 2014, at 4:22 PM, Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> >
> >> Hi Devesh,
> >>
> >> This looks a lot better. I still have a couple of small suggestions, though.
> >>
> >> On Apr 9, 2014, at 14:40, Devesh Sharma <devesh.sharma@xxxxxxxxxx> wrote:
> >>
> >>> If rdma_create_qp() fails to create the QP because the device firmware
> >>> is in an invalid state, xprtrdma still tries to destroy the
> >>> non-existent QP and ends up in a NULL pointer dereference crash.
> >>> Adding proper checks to validate the QP pointer prevents this.
> >>>
> >>> Signed-off-by: Devesh Sharma <devesh.sharma@xxxxxxxxxx>
> >>> ---
> >>> net/sunrpc/xprtrdma/verbs.c |   29 +++++++++++++++++++++++++----
> >>> 1 files changed, 25 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> >>> index 9372656..902ac78 100644
> >>> --- a/net/sunrpc/xprtrdma/verbs.c
> >>> +++ b/net/sunrpc/xprtrdma/verbs.c
> >>> @@ -831,10 +831,12 @@ rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
> >>> 	if (ep->rep_connected != 0) {
> >>> 		struct rpcrdma_xprt *xprt;
> >>> retry:
> >>> -		rc = rpcrdma_ep_disconnect(ep, ia);
> >>> -		if (rc && rc != -ENOTCONN)
> >>> -			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
> >>> +		if (ia->ri_id->qp) {
> >>> +			rc = rpcrdma_ep_disconnect(ep, ia);
> >>> +			if (rc && rc != -ENOTCONN)
> >>> +				dprintk("RPC:       %s: rpcrdma_ep_disconnect"
> >>> 				" status %i\n", __func__, rc);
> >>> +		}
> >>> 		rpcrdma_clean_cq(ep->rep_cq);
> >>>
> >>> 		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
> >>> @@ -859,7 +861,9 @@ retry:
> >>> 			goto out;
> >>> 		}
> >>> 		/* END TEMP */
> >>> -		rdma_destroy_qp(ia->ri_id);
> >>> +		if (ia->ri_id->qp) {
> >>> +			rdma_destroy_qp(ia->ri_id);
> >>> +		}
> >>
> >> Nit: No need for braces here.
> >>
> >>> 		rdma_destroy_id(ia->ri_id);
> >>> 		ia->ri_id = id;
> >>> 	}
> >>> @@ -1557,6 +1561,13 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
> >>> 	frmr_wr.wr.fast_reg.rkey = seg1->mr_chunk.rl_mw->r.frmr.fr_mr->rkey;
> >>> 	DECR_CQCOUNT(&r_xprt->rx_ep);
> >
> > I don't think you can DECR_CQCOUNT, then exit without posting the send.
> > That will screw up the completion counter and result in a transport hang,
> > won't it?
> >
> >>>
> >>> +	if (!ia->ri_id->qp) {
> >>> +		rc = -EINVAL;
> >>> +		while (i--)
> >>> +			rpcrdma_unmap_one(ia, --seg);
> >>> +		goto out;
> >>> +	}
> >>
> >> Instead of duplicating the rpcrdma_unmap_one() cleanup here, why not
> >> just do
> >>
> >> 	if (ia->ri_id->qp)
> >> 		rc = ib_post_send(...)
> >> 	else
> >> 		rc = -EINVAL;
> >>
> >> BTW: can we not simply test for ia->ri_id->qp before we even call
> >> rpcrdma_map_one() and hence bail out before we have to do any cleanup?
> >>
> >>> +
> >>> 	rc = ib_post_send(ia->ri_id->qp, post_wr, &bad_wr);
> >>>
> >>> 	if (rc) {
> >>> @@ -1571,6 +1582,7 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
> >>> 		seg1->mr_len = len;
> >>> 	}
> >>> 	*nsegs = i;
> >>> +out:
> >>> 	return rc;
> >>> }
> >>>
> >>> @@ -1592,6 +1604,9 @@ rpcrdma_deregister_frmr_external(struct rpcrdma_mr_seg *seg,
> >>> 	invalidate_wr.ex.invalidate_rkey = seg1->mr_chunk.rl_mw->r.frmr.fr_mr->rkey;
> >>> 	DECR_CQCOUNT(&r_xprt->rx_ep);
> >
> > Ditto.
> >
> >>>
> >>> +	if (!ia->ri_id->qp)
> >>> +		return -EINVAL;
> >>> +
> >>> 	rc = ib_post_send(ia->ri_id->qp, &invalidate_wr, &bad_wr);
> >>> 	if (rc)
> >>> 		dprintk("RPC:       %s: failed ib_post_send for invalidate,"
> >>> @@ -1923,6 +1938,9 @@ rpcrdma_ep_post(struct rpcrdma_ia *ia,
> >>> 		send_wr.send_flags = IB_SEND_SIGNALED;
> >>> 	}
> >
> > Ditto.
> >
> >>>
> >>> +	if (!ia->ri_id->qp)
> >>> +		return -EINVAL;
> >>> +
> >>> 	rc = ib_post_send(ia->ri_id->qp, &send_wr, &send_wr_fail);
> >>> 	if (rc)
> >>> 		dprintk("RPC:       %s: ib_post_send returned %i\n", __func__,
> >>> @@ -1951,6 +1969,9 @@ rpcrdma_ep_post_recv(struct rpcrdma_ia *ia,
> >>> 		rep->rr_iov.addr, rep->rr_iov.length, DMA_BIDIRECTIONAL);
> >>>
> >>> 	DECR_CQCOUNT(ep);
> >
> > And here.
> >
> >>> +
> >>> +	if (!ia->ri_id->qp)
> >>> +		return -EINVAL;
> >>> 	rc = ib_post_recv(ia->ri_id->qp, &recv_wr, &recv_wr_fail);
> >>>
> >>> 	if (rc)
> >>> --
> >>> 1.7.1
> >>>
> >>
> >> _________________________________
> >> Trond Myklebust
> >> Linux NFS client maintainer, PrimaryData
> >> trond.myklebust@xxxxxxxxxxxxxxx
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> >> in the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> > --
> > Chuck Lever
> > chuck[dot]lever[at]oracle[dot]com
> >
> >
> >
> 
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
> 
> 
