On Wed, May 08, 2019 at 05:06:44PM +0300, Leon Romanovsky wrote: > On Wed, May 08, 2019 at 10:30:28AM -0300, Jason Gunthorpe wrote: > > On Wed, May 08, 2019 at 09:26:00AM +0300, Leon Romanovsky wrote: > > > On Tue, May 07, 2019 at 01:13:04PM -0300, Jason Gunthorpe wrote: > > > > On Mon, May 06, 2019 at 04:38:27PM -0400, Doug Ledford wrote: > > > > > So, Jason and I were discussing the soft-iWARP driver submission, and he > > > > > thought it would be good to know if it even works with the various iWARP > > > > > hardware devices. I happen to have most of them on hand in one form or > > > > > another, so I set down to test it. In the process, I ran across some > > > > > issues just with the hardware versions themselves, let alone with soft- > > > > > iWARP. So, here's the results of my matrix of tests. These aren't > > > > > performance tests, just basic "does it work" smoke tests... > > > > > > > > Well, lets imagine to merge this at 5.2-rc1? > > > > > > Can we do something with kref in QPs and MRs before merging it? > > > > > > I'm super worried that memory model and locking used in this driver > > > won't allow me to continue with allocation patches? > > > > Well, this use of idr doesn't look right to me: > > > > static inline struct siw_qp *siw_qp_id2obj(struct siw_device *sdev, int id) > > { > > struct siw_qp *qp = idr_find(&sdev->qp_idr, id); > > > > if (likely(qp && kref_get_unless_zero(&qp->ref))) > > return qp; > > > > kref_get_unless_zero is nonsense unless used with someting like rcu, > > and there is no rcu read lock here. > > > > Also, IDR's have to be locked.. > > > > It probably wants to be written as > > > > xa_lock() > > qp = xa_load() > > if (qp) > > kref_get(&qp->ref); > > xa_unlock() > > > > But I'm not completely sure what this is all about.. A QP cannot > > really exist past destroy - about the only thing that would make sense > > is to leave some memory around so other things can see it is failed - > > but generally it is better to wipe out the QP from those other things > > then attempt to do reference counting like this. > > No, no,, no, it is still not enough. I need to be sure that destroy path > always successes and kref_get(&qp->ref) doesn't guarantee that. > > The good coding pattern can be seen in rdmavt > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/sw/rdmavt/cq.c#L316 > They krefing and releasing extra structure outside of user visible object. In some respects I would rather the core code put a proper memory kref in every object. We wanted this anyhow for the netlink restrack stuff, and used properly it is pretty useful. Jason