Re: [PATCH RESEND for-next 5/9] IB/{rdmavt, qib, hfi1}: Use new routine to release reference counts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 15, 2019 at 08:52:52PM +0000, Ruhl, Michael J wrote:
> >-----Original Message-----
> >From: Leon Romanovsky [mailto:leon@xxxxxxxxxx]
> >Sent: Monday, April 15, 2019 9:13 AM
> >To: Dalessandro, Dennis <dennis.dalessandro@xxxxxxxxx>
> >Cc: jgg@xxxxxxxx; dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Ruhl,
> >Michael J <michael.j.ruhl@xxxxxxxxx>; Marciniszyn, Mike
> ><mike.marciniszyn@xxxxxxxxx>
> >Subject: Re: [PATCH RESEND for-next 5/9] IB/{rdmavt, qib, hfi1}: Use new
> >routine to release reference counts
> >
> >On Mon, Apr 15, 2019 at 08:31:03AM -0400, Dennis Dalessandro wrote:
> >> On 4/12/2019 11:43 AM, Leon Romanovsky wrote:
> >> > On Fri, Apr 12, 2019 at 06:41:42AM -0700, Dennis Dalessandro wrote:
> >> > > From: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> >> > >
> >> > > The reference count adjustments on reference count completion
> >> > > are open coded throughout.
> >> > >
> >> > > Add a routine to do all reference count adjustments and use.
> >> > >
> >> > > Reviewed-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> >> > > Signed-off-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> >> > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> >> > >
> >> > > ---
> >> > > For some reason patch didn't make the list or patchworks. This
> >> > > is just a re-send. All the rest seem to have made it.
> >> > > ---
> >> > >   drivers/infiniband/hw/hfi1/rc.c    |    4 ++--
> >> > >   drivers/infiniband/hw/qib/qib_rc.c |    4 ++--
> >> > >   drivers/infiniband/sw/rdmavt/qp.c  |    9 ++-------
> >> > >   include/rdma/rdmavt_qp.h           |   14 ++++++++++++++
> >> > >   4 files changed, 20 insertions(+), 11 deletions(-)
> >> > >
> >> > > diff --git a/drivers/infiniband/hw/hfi1/rc.c
> >b/drivers/infiniband/hw/hfi1/rc.c
> >> > > index 5ba39a9..a922edc 100644
> >> > > --- a/drivers/infiniband/hw/hfi1/rc.c
> >> > > +++ b/drivers/infiniband/hw/hfi1/rc.c
> >> > > @@ -1834,7 +1834,7 @@ void hfi1_rc_send_complete(struct rvt_qp
> >*qp, struct hfi1_opa_header *opah)
> >> > >   		qp->s_last = s_last;
> >> > >   		/* see post_send() */
> >> > >   		barrier();
> >> > > -		rvt_put_swqe(wqe);
> >> > > +		rvt_put_qp_swqe(qp, wqe);
> >> > >   		rvt_qp_swqe_complete(qp,
> >> > >   				     wqe,
> >> > >   				     ib_hfi1_wc_opcode[wqe->wr.opcode],
> >> > > @@ -1882,7 +1882,7 @@ struct rvt_swqe *do_rc_completion(struct
> >rvt_qp *qp,
> >> > >   		u32 s_last;
> >> > >
> >> > >   		trdma_clean_swqe(qp, wqe);
> >> > > -		rvt_put_swqe(wqe);
> >> > > +		rvt_put_qp_swqe(qp, wqe);
> >> > >   		rvt_qp_wqe_unreserve(qp, wqe);
> >> > >   		s_last = qp->s_last;
> >> > >   		trace_hfi1_qp_send_completion(qp, wqe, s_last);
> >> > > diff --git a/drivers/infiniband/hw/qib/qib_rc.c
> >b/drivers/infiniband/hw/qib/qib_rc.c
> >> > > index 50dd981..2ac4c67 100644
> >> > > --- a/drivers/infiniband/hw/qib/qib_rc.c
> >> > > +++ b/drivers/infiniband/hw/qib/qib_rc.c
> >> > > @@ -933,7 +933,7 @@ void qib_rc_send_complete(struct rvt_qp *qp,
> >struct ib_header *hdr)
> >> > >   		qp->s_last = s_last;
> >> > >   		/* see post_send() */
> >> > >   		barrier();
> >> > > -		rvt_put_swqe(wqe);
> >> > > +		rvt_put_qp_swqe(qp, wqe);
> >> > >   		rvt_qp_swqe_complete(qp,
> >> > >   				     wqe,
> >> > >   				     ib_qib_wc_opcode[wqe->wr.opcode],
> >> > > @@ -975,7 +975,7 @@ static inline void update_last_psn(struct rvt_qp
> >*qp, u32 psn)
> >> > >   	    qib_cmp24(qp->s_sending_psn, qp->s_sending_hpsn) > 0) {
> >> > >   		u32 s_last;
> >> > >
> >> > > -		rvt_put_swqe(wqe);
> >> > > +		rvt_put_qp_swqe(qp, wqe);
> >> > >   		s_last = qp->s_last;
> >> > >   		if (++s_last >= qp->s_size)
> >> > >   			s_last = 0;
> >> > > diff --git a/drivers/infiniband/sw/rdmavt/qp.c
> >b/drivers/infiniband/sw/rdmavt/qp.c
> >> > > index 2460303..31a2e65 100644
> >> > > --- a/drivers/infiniband/sw/rdmavt/qp.c
> >> > > +++ b/drivers/infiniband/sw/rdmavt/qp.c
> >> > > @@ -623,10 +623,7 @@ static void rvt_clear_mr_refs(struct rvt_qp *qp,
> >int clr_sends)
> >> > >   		while (qp->s_last != qp->s_head) {
> >> > >   			struct rvt_swqe *wqe = rvt_get_swqe_ptr(qp, qp-
> >>s_last);
> >> > >
> >> > > -			rvt_put_swqe(wqe);
> >> > > -			if (qp->allowed_ops == IB_OPCODE_UD)
> >> > > -				atomic_dec(&ibah_to_rvtah(
> >> > > -						wqe->ud_wr.ah)->refcount);
> >> > > +			rvt_put_qp_swqe(qp, wqe);
> >> > >   			if (++qp->s_last >= qp->s_size)
> >> > >   				qp->s_last = 0;
> >> > >   			smp_wmb(); /* see qp_set_savail */
> >> > > @@ -2683,9 +2680,7 @@ void rvt_send_complete(struct rvt_qp *qp,
> >struct rvt_swqe *wqe,
> >> > >   	qp->s_last = last;
> >> > >   	/* See post_send() */
> >> > >   	barrier();
> >> > > -	rvt_put_swqe(wqe);
> >> > > -	if (qp->allowed_ops == IB_OPCODE_UD)
> >> > > -		atomic_dec(&ibah_to_rvtah(wqe->ud_wr.ah)->refcount);
> >> > > +	rvt_put_qp_swqe(qp, wqe);
> >> > >
> >> > >   	rvt_qp_swqe_complete(qp,
> >> > >   			     wqe,
> >> > > diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
> >> > > index a00c46a..68e38c2 100644
> >> > > --- a/include/rdma/rdmavt_qp.h
> >> > > +++ b/include/rdma/rdmavt_qp.h
> >> > > @@ -723,6 +723,20 @@ static inline void rvt_mod_retry_timer(struct
> >rvt_qp *qp)
> >> > >   	return rvt_mod_retry_timer_ext(qp, 0);
> >> > >   }
> >> > >
> >> > > +/**
> >> > > + * rvt_put_qp_swqe - drop refs held by swqe
> >> > > + * @qp: the send qp
> >> > > + * @wqe: the send wqe
> >> > > + *
> >> > > + * This drops any references held by the swqe
> >> > > + */
> >> > > +static inline void rvt_put_qp_swqe(struct rvt_qp *qp, struct rvt_swqe
> >*wqe)
> >> > > +{
> >> > > +	rvt_put_swqe(wqe);
> >> > > +	if (qp->allowed_ops == IB_OPCODE_UD)
> >> > > +		atomic_dec(&ibah_to_rvtah(wqe->ud_wr.ah)->refcount);
> >> > > +}
> >> >
> >> > In the context of my allocation change patches, I wanted to ask on which
> >> > objects should driver perform refcounting?
> >>
> >> Are you asking if we really need to do our own refcounting in the driver
> >> since your patch series went in to for-next?
> >
> >For this AH code - yes, I don't think that refcounting is needed.
>
> Hi Leon,
>
> We do need the reference count because the AH is used by the asynchronous
> send engine.
>
> I am going to work up a patch to implement the SLEEPABLE path in our destroy
> to deal with this.

I see, but it is not SLEEPABLE/not SLEEPABLE paths, there is a need to ensure
that destroy AH is synchronous operation.

Do you have time expectation when will it be ready?

It seems like my commit d345691471b4 ("RDMA: Handle AH
allocations by IB/core") assumed that all releases are synchronous.

There is use-after-free bug now in your driver.

>
> >However my question is more general - I need to know in which objects
> >we can remove refcounting without any worries that I'll break some driver.
>
> That is tougher to answer, and will need some more research.
>
> Do you have a list of objects in mind to start with?

The most challengeable are MR and QPs, everything else can definitely be
transformed to be without refcounting.

>
> Thanks,
>
> Mike
>
>
> >Thanks
> >
> >>
> >> -Denny
> >>
> >>
> >>

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux