Re: iWARP and soft-iWARP interop testing

Leon Romanovsky <leon@xxxxxxxxxx> · Wed, 8 May 2019 19:19:03 +0300

On Wed, May 08, 2019 at 03:15:59PM +0000, Bernard Metzler wrote:
> -----"Leon Romanovsky" <leon@xxxxxxxxxx> wrote: -----
>
> >To: "Jason Gunthorpe" <jgg@xxxxxxxx>
> >From: "Leon Romanovsky" <leon@xxxxxxxxxx>
> >Date: 05/08/2019 04:25PM
> >Cc: "Doug Ledford" <dledford@xxxxxxxxxx>, "linux-rdma"
> ><linux-rdma@xxxxxxxxxxxxxxx>, "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>
> >Subject: Re: iWARP and soft-iWARP interop testing
> >
> >On Wed, May 08, 2019 at 11:18:41AM -0300, Jason Gunthorpe wrote:
> >> On Wed, May 08, 2019 at 05:06:44PM +0300, Leon Romanovsky wrote:
> >> > On Wed, May 08, 2019 at 10:30:28AM -0300, Jason Gunthorpe wrote:
> >> > > On Wed, May 08, 2019 at 09:26:00AM +0300, Leon Romanovsky
> >wrote:
> >> > > > On Tue, May 07, 2019 at 01:13:04PM -0300, Jason Gunthorpe
> >wrote:
> >> > > > > On Mon, May 06, 2019 at 04:38:27PM -0400, Doug Ledford
> >wrote:
> >> > > > > > So, Jason and I were discussing the soft-iWARP driver
> >submission, and he
> >> > > > > > thought it would be good to know if it even works with
> >the various iWARP
> >> > > > > > hardware devices.  I happen to have most of them on hand
> >in one form or
> >> > > > > > another, so I set down to test it.  In the process, I ran
> >across some
> >> > > > > > issues just with the hardware versions themselves, let
> >alone with soft-
> >> > > > > > iWARP.  So, here's the results of my matrix of tests.
> >These aren't
> >> > > > > > performance tests, just basic "does it work" smoke
> >tests...
> >> > > > >
> >> > > > > Well, lets imagine to merge this at 5.2-rc1?
> >> > > >
> >> > > > Can we do something with kref in QPs and MRs before merging
> >it?
> >> > > >
> >> > > > I'm super worried that memory model and locking used in this
> >driver
> >> > > > won't allow me to continue with allocation patches?
> >> > >
> >> > > Well, this use of idr doesn't look right to me:
> >> > >
> >> > > static inline struct siw_qp *siw_qp_id2obj(struct siw_device
> >*sdev, int id)
> >> > > {
> >> > > 	struct siw_qp *qp = idr_find(&sdev->qp_idr, id);
> >> > >
> >> > > 	if (likely(qp && kref_get_unless_zero(&qp->ref)))
> >> > > 		return qp;
> >> > >
> >> > > kref_get_unless_zero is nonsense unless used with someting like
> >rcu,
> >> > > and there is no rcu read lock here.
> >> > >
> >> > > Also, IDR's have to be locked..
> >> > >
> >> > > It probably wants to be written as
> >> > >
> >> > > xa_lock()
> >> > > qp = xa_load()
> >> > > if (qp)
> >> > >    kref_get(&qp->ref);
> >> > > xa_unlock()
> >> > >
> >> > > But I'm not completely sure what this is all about.. A QP
> >cannot
> >> > > really exist past destroy - about the only thing that would
> >make sense
> >> > > is to leave some memory around so other things can see it is
> >failed -
> >> > > but generally it is better to wipe out the QP from those other
> >things
> >> > > then attempt to do reference counting like this.
> >> >
> >> > No, no,, no, it is still not enough. I need to be sure that
> >destroy path
> >> > always successes and kref_get(&qp->ref) doesn't guarantee that.
> >> >
> >> > The good coding pattern can be seen in rdmavt
> >> >
> >https://urldefense.proofpoint.com/v2/url?u=https-3A__elixir.bootlin.c
> >om_linux_latest_source_drivers_infiniband_sw_rdmavt_cq.c-23L316&d=DwI
> >BAg&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8ZO1PP1alNwU_QJcRRLfmYTAgd3Q
> >CvqSc&m=W8K4_QR3oPmfyY52_46Q1ICeIMJr5MSNJIsPe9AgVBM&s=sWDpPSRP82Q7pD_
> >Q0fUJl44yuL42iMKHv0AKta4KGUo&e=
> >> > They krefing and releasing extra structure outside of user
> >visible object.
> >>
> >> In some respects I would rather the core code put a proper memory
> >kref
> >> in every object. We wanted this anyhow for the netlink restrack
> >> stuff, and used properly it is pretty useful.
> >
> >We can do it and for sure will do it, but in meanwhile I would prefer
> >do not
> >see additions of krefs in drivers.
> >
>
> Without questioning the concept here, moving allocation and freeing
> of core resources to the mid layer may induce complex changes at
> driver level. Especially for a SW driver, which references those
> objects on the fast path. Following the approach rdmavt takes,
> that results in a split of objects between driver and mid layer.
> So we abandon the idea of folding driver state and mid layer
> state into one object.
> I can think of such a thing for memory objects. What is the time
> line for those changes to the mid layer? Will it be part of 5.2?

I'm doing CQ now and it can be faster if I wouldn't need to fix driver
code in that area.

>
> I guarantee I'll do those changes when really needed, but I'd
> leave a rather stable current state. I am not saying I am reluctant
> to do so.

They are needed now, please ensure that all _destroy_ flows success
and don't return before they kfreed referenced object.

I don't want to see code like we have in nes:
destroy_cq()
{
   if (...)
      // memory leak
      return;

   kfree(cq);
}

Thanks

>
>
> Thanks,
> Bernard.
>