On Fri, Oct 02, 2020 at 09:42:17AM -0300, Jason Gunthorpe wrote:
> On Sat, Sep 26, 2020 at 01:19:35PM +0300, Leon Romanovsky wrote:
> > diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> > index 12ebacf52958..1abcb01d362f 100644
> > +++ b/drivers/infiniband/core/cq.c
> > @@ -267,10 +267,25 @@ struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe,
> >                  goto out_destroy_cq;
> >          }
> >
> > -        rdma_restrack_add(&cq->res);
> > +        ret = rdma_restrack_add(&cq->res);
> > +        if (ret)
> > +                goto out_poll_cq;
> > +
> >          trace_cq_alloc(cq, nr_cqe, comp_vector, poll_ctx);
> >          return cq;
> >
> > +out_poll_cq:
> > +        switch (cq->poll_ctx) {
> > +        case IB_POLL_SOFTIRQ:
> > +                irq_poll_disable(&cq->iop);
> > +                break;
> > +        case IB_POLL_WORKQUEUE:
> > +        case IB_POLL_UNBOUND_WORKQUEUE:
> > +                cancel_work_sync(&cq->work);
>
> This error unwind is *technically* in the wrong order, it is wrong in
> ib_free_cq too which is an actual bug.
>
> The cq->comp_handler should be set before calling create_cq and undone
> after calling destroy_wq. We can do this right now that the
> allocations have been reworked.
>
> Otherwise there is no assurance the ib_cq_completion_workqueue() won't
> be called after this cancel == use after free
>
> Also, you need to check all the rdma_restrack_del()'s, they should
> always be *before* destroying the HW object, eg ib_free_cq() has it
> too late. Similarly the add should always be after the HW object is
> allocated.

That is true for the objects that are not converted yet (QP and MR).
Everything that was converted has two steps, rdma_restrack_put() before
creation and rdma_restrack_add() right after creation, plus
rdma_restrack_del() after successful destroy.

>
> For instance fill_res_cq_entry() calls
>
>     dev->ops.fill_res_cq_entry(msg, cq)
>
> on an already free'd HW object with this arrangement.
>
> These are pre-existing things so let's fix them separately please

I'll fix it later.

>
> Jason
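
For reference, the ordering requested above (publish an object to the tracker only after the HW object exists, and remove it from the tracker before the HW object is destroyed) can be illustrated with a small userspace sketch. All names here (struct tracked_cq, cq_create(), cq_destroy(), tracker_query()) are hypothetical stand-ins chosen for illustration. They are not the restrack API and not the eventual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's types or helpers. */
struct hw_obj {
    int cqe;                 /* pretend driver-owned state */
};

struct tracked_cq {
    struct hw_obj *hw;       /* NULL once the HW object is destroyed */
    bool visible;            /* true while tracker users may look at us */
};

/* Stand-in for a tracker dump callback that touches the HW object. */
static int tracker_query(const struct tracked_cq *cq)
{
    if (!cq->visible)
        return -1;           /* not in the tracker, nothing to report */
    assert(cq->hw != NULL);  /* would be a use-after-free otherwise */
    return cq->hw->cqe;
}

static int cq_create(struct tracked_cq *cq, int nr_cqe)
{
    cq->visible = false;
    cq->hw = malloc(sizeof(*cq->hw));   /* 1. allocate the HW object */
    if (!cq->hw)
        return -1;
    cq->hw->cqe = nr_cqe;
    cq->visible = true;                 /* 2. only then publish it */
    return 0;
}

static void cq_destroy(struct tracked_cq *cq)
{
    cq->visible = false;                /* 1. hide from the tracker first */
    free(cq->hw);                       /* 2. then destroy the HW object */
    cq->hw = NULL;
}
```

With this ordering a query can never observe a visible entry whose HW object is already gone. Swapping the two steps in cq_destroy() opens exactly the window described for fill_res_cq_entry().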