> From: Jason Gunthorpe <jgg@xxxxxxxx> > Sent: Tuesday, August 07, 2018 5:28 PM > On Tue, Aug 07, 2018 at 10:02:03AM +0000, Kalderon, Michal wrote: > > > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma- > > > owner@xxxxxxxxxxxxxxx> On Behalf Of Jason Gunthorpe > > > > > > On Sun, Jul 29, 2018 at 01:16:10PM +0300, Yuval Bason wrote: > > > > > > > diff --git a/drivers/infiniband/hw/qedr/qedr_iw_cm.c > > > > b/drivers/infiniband/hw/qedr/qedr_iw_cm.c > > > > index 26dc374..505fa36 100644 > > > > +++ b/drivers/infiniband/hw/qedr/qedr_iw_cm.c > > > > @@ -491,7 +491,7 @@ int qedr_iw_connect(struct iw_cm_id *cm_id, > > > struct iw_cm_conn_param *conn_param) > > > > int rc = 0; > > > > int i; > > > > > > > > - qp = idr_find(&dev->qpidr, conn_param->qpn); > > > > + qp = idr_find(&dev->qpidr.idr, conn_param->qpn); > > > > > > > > laddr = (struct sockaddr_in *)&cm_id->m_local_addr; > > > > raddr = (struct sockaddr_in *)&cm_id->m_remote_addr; @@ > > > > -679,7 > > > > +679,7 @@ int qedr_iw_accept(struct iw_cm_id *cm_id, struct > > > > iw_cm_conn_param *conn_param) > > > > > > [..] > > > > > > > @@ -2279,8 +2275,9 @@ int qedr_destroy_qp(struct ib_qp *ibqp) > > > > > > > > qedr_free_qp_resources(dev, qp); > > > > > > > > - if (atomic_dec_and_test(&qp->refcnt)) { > > > > - qedr_idr_remove(dev, qp->qp_id); > > > > + if (atomic_dec_and_test(&qp->refcnt) && > > > > + rdma_protocol_iwarp(&dev->ibdev, 1)) { > > > > + qedr_idr_remove(dev, &dev->qpidr, qp->qp_id); > > > > kfree(qp); > > > > > > So what prevents kfree from racing with the idr_find above? Looks > > > like noting. > > > > Hi Jason, > > > This is the QP idr, and the use case here is for iWARP rdma connection > > flow. The flows here are part of the rdma-cm for iWARP and the races > > here are dealt with by locking and ref-counting in the rdma-cm. > > How? Specifically for the kfree race with idr_find the description below explains how it can't occur. But for the race of the idr being modified by adding/removing a different QP and performing idr_find, I agree, we need an rcu_read_lock around the idr_find. when registering an iWARP device, we provide an additional set of callbacks into our driver which are used by the iwcm: dev->ibdev.iwcm->connect = qedr_iw_connect; dev->ibdev.iwcm->accept = qedr_iw_accept; dev->ibdev.iwcm->reject = qedr_iw_reject; dev->ibdev.iwcm->create_listen = qedr_iw_create_listen; dev->ibdev.iwcm->destroy_listen = qedr_iw_destroy_listen; dev->ibdev.iwcm->add_ref = qedr_iw_qp_add_ref; dev->ibdev.iwcm->rem_ref = qedr_iw_qp_rem_ref; dev->ibdev.iwcm->get_qp = qedr_iw_get_qp; The idr_find above that you're referring to won't race with the kfree(qp) because qedr_idr_remove & kfree(qp) will be called only when the qp->refcnt reaches zero (atomic_dec_and_test will compare to zero ), the qp->refcnt is increased when add_ref is called, when rem_ref is called it is decreased ( and if it reaches zero memory is freed as well) Before qedr_iw_connect is called the rdma-cm calls add_ref ( snippet below) Copied from: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drivers/infiniband/core/iwcm.c?h=v4.18-rc8 iw_cm_connect ... /* Get the ib_qp given the QPN */ qp = cm_id->device->iwcm->get_qp(cm_id->device, iw_param->qpn); if (!qp) { ret = -EINVAL; goto err; } cm_id->device->iwcm->add_ref(qp); -------> This is where refcnt is increased cm_id_priv->qp = qp; cm_id_priv->state = IW_CM_STATE_CONN_SENT; spin_unlock_irqrestore(&cm_id_priv->lock, flags); ret = iw_cm_map(cm_id, true); if (!ret) ret = cm_id->device->iwcm->connect(cm_id, iw_param); ------> This is where qedr_iw_connect is called, no risk of refcnt being zero if destroy_qp is called if (!ret) return 0; I'll send a separate bug fix for adding the rcu_read_lock around idr_find in qedr_iw_connect, qedr_iw_accept, qedr_iw_get_qp right after Yuval sends the SRQ v5 patch Thanks, Michal > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html