On Wed, May 08, 2019 at 03:15:59PM +0000, Bernard Metzler wrote: > -----"Leon Romanovsky" <leon@xxxxxxxxxx> wrote: ----- > > >To: "Jason Gunthorpe" <jgg@xxxxxxxx> > >From: "Leon Romanovsky" <leon@xxxxxxxxxx> > >Date: 05/08/2019 04:25PM > >Cc: "Doug Ledford" <dledford@xxxxxxxxxx>, "linux-rdma" > ><linux-rdma@xxxxxxxxxxxxxxx>, "Bernard Metzler" <BMT@xxxxxxxxxxxxxx> > >Subject: Re: iWARP and soft-iWARP interop testing > > > >On Wed, May 08, 2019 at 11:18:41AM -0300, Jason Gunthorpe wrote: > >> On Wed, May 08, 2019 at 05:06:44PM +0300, Leon Romanovsky wrote: > >> > On Wed, May 08, 2019 at 10:30:28AM -0300, Jason Gunthorpe wrote: > >> > > On Wed, May 08, 2019 at 09:26:00AM +0300, Leon Romanovsky > >wrote: > >> > > > On Tue, May 07, 2019 at 01:13:04PM -0300, Jason Gunthorpe > >wrote: > >> > > > > On Mon, May 06, 2019 at 04:38:27PM -0400, Doug Ledford > >wrote: > >> > > > > > So, Jason and I were discussing the soft-iWARP driver > >submission, and he > >> > > > > > thought it would be good to know if it even works with > >the various iWARP > >> > > > > > hardware devices. I happen to have most of them on hand > >in one form or > >> > > > > > another, so I set down to test it. In the process, I ran > >across some > >> > > > > > issues just with the hardware versions themselves, let > >alone with soft- > >> > > > > > iWARP. So, here's the results of my matrix of tests. > >These aren't > >> > > > > > performance tests, just basic "does it work" smoke > >tests... > >> > > > > > >> > > > > Well, lets imagine to merge this at 5.2-rc1? > >> > > > > >> > > > Can we do something with kref in QPs and MRs before merging > >it? > >> > > > > >> > > > I'm super worried that memory model and locking used in this > >driver > >> > > > won't allow me to continue with allocation patches? > >> > > > >> > > Well, this use of idr doesn't look right to me: > >> > > > >> > > static inline struct siw_qp *siw_qp_id2obj(struct siw_device > >*sdev, int id) > >> > > { > >> > > struct siw_qp *qp = idr_find(&sdev->qp_idr, id); > >> > > > >> > > if (likely(qp && kref_get_unless_zero(&qp->ref))) > >> > > return qp; > >> > > > >> > > kref_get_unless_zero is nonsense unless used with someting like > >rcu, > >> > > and there is no rcu read lock here. > >> > > > >> > > Also, IDR's have to be locked.. > >> > > > >> > > It probably wants to be written as > >> > > > >> > > xa_lock() > >> > > qp = xa_load() > >> > > if (qp) > >> > > kref_get(&qp->ref); > >> > > xa_unlock() > >> > > > >> > > But I'm not completely sure what this is all about.. A QP > >cannot > >> > > really exist past destroy - about the only thing that would > >make sense > >> > > is to leave some memory around so other things can see it is > >failed - > >> > > but generally it is better to wipe out the QP from those other > >things > >> > > then attempt to do reference counting like this. > >> > > >> > No, no,, no, it is still not enough. I need to be sure that > >destroy path > >> > always successes and kref_get(&qp->ref) doesn't guarantee that. > >> > > >> > The good coding pattern can be seen in rdmavt > >> > > >https://urldefense.proofpoint.com/v2/url?u=https-3A__elixir.bootlin.c > >om_linux_latest_source_drivers_infiniband_sw_rdmavt_cq.c-23L316&d=DwI > >BAg&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8ZO1PP1alNwU_QJcRRLfmYTAgd3Q > >CvqSc&m=W8K4_QR3oPmfyY52_46Q1ICeIMJr5MSNJIsPe9AgVBM&s=sWDpPSRP82Q7pD_ > >Q0fUJl44yuL42iMKHv0AKta4KGUo&e= > >> > They krefing and releasing extra structure outside of user > >visible object. > >> > >> In some respects I would rather the core code put a proper memory > >kref > >> in every object. We wanted this anyhow for the netlink restrack > >> stuff, and used properly it is pretty useful. > > > >We can do it and for sure will do it, but in meanwhile I would prefer > >do not > >see additions of krefs in drivers. > > > > Without questioning the concept here, moving allocation and freeing > of core resources to the mid layer may induce complex changes at > driver level. Especially for a SW driver, which references those > objects on the fast path. Following the approach rdmavt takes, > that results in a split of objects between driver and mid layer. > So we abandon the idea of folding driver state and mid layer > state into one object. > I can think of such a thing for memory objects. What is the time > line for those changes to the mid layer? Will it be part of 5.2? I'm doing CQ now and it can be faster if I wouldn't need to fix driver code in that area. > > I guarantee I'll do those changes when really needed, but I'd > leave a rather stable current state. I am not saying I am reluctant > to do so. They are needed now, please ensure that all _destroy_ flows success and don't return before they kfreed referenced object. I don't want to see code like we have in nes: destroy_cq() { if (...) // memory leak return; kfree(cq); } Thanks > > > Thanks, > Bernard. >