On Mon, May 20, 2019 at 05:24:43PM +0300, Gal Pressman wrote: > On 20/05/2019 16:10, Leon Romanovsky wrote: > > On Mon, May 20, 2019 at 03:39:26PM +0300, Gal Pressman wrote: > >> On 20/05/2019 9:54, Leon Romanovsky wrote: > >>> From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > >>> > >>> There are two possible execution contexts of destroy flows in EFA. > >>> One is normal flow where user explicitly asked for object release > >>> and another error unwinding. > >>> > >>> In normal scenario, RDMA/core will ensure that udata is supplied > >>> according to KABI contract, for now it means no udata at all. > >>> > >>> In unwind flow, the EFA driver will receive uncleared udata from > >>> numerous *_create_*() calls, but won't release those resources > >>> due to extra checks. > >> > >> Thanks for the fix Leon, a few questions: > >> > >> Some of the unwind flows pass NULL udata and others an uncleared udata (is it > >> really uncleared or is it actually the create udata?), what are we considering > >> as the expected behavior? Isn't passing an uncleared udata the bug here? > > > > It is a matter of unwind sequence, if IB/core did something after > > driver created some object, it will need to call to destroy of this > > object too. So I don't think that it is the bug. > > > > And yes, it is not applicable for all flows, the one which caused me to > > write this patch is failure in ib_uverbs_reg_mr(), which will call to > > ib_dereg_mr_user(mr, &attrs->driver_udata); > > > > and attrs->driver_udata is valid there. > > Right, but is it really valid? The udata in/out buffers in that case are > actually the create buffers and the driver has no way of telling. I think a > better approach is to clear the udata before calling the driver as done in > normal destroy flow. > > Also, create_qp flow for example calls ib_destroy_qp on failure which passes > NULL udata, the different flows are not consistent and I don't see a reason why > they shouldn't be? Sorry, but I didn't look deeply enough on QPs and MRs yet to make clear cut, but from what I have seen till now, the implementation looks like a disaster. > > > > >> > >> Also, if passing NULL udata is expected (why?) we have a bigger problem here as > >> existing code will cause NULL dereference. > > > > Not anymore, the destroy paths are not relying on udata now. > > This patch solves it, I'm thinking we need another patch for for-rc to prevent > panic. I don't think that it is worth of hassle. EFA just entered kernel, rdma-core is not merged yet and anyway no one can use EFA adapter in his home. Thanks > > > > >> > >>> > >>> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx> > >>> --- > >>> drivers/infiniband/hw/efa/efa_verbs.c | 24 ------------------------ > >>> 1 file changed, 24 deletions(-) > >>> > >>> diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c > >>> index 6d6886c9009f..4999a74cee24 100644 > >>> --- a/drivers/infiniband/hw/efa/efa_verbs.c > >>> +++ b/drivers/infiniband/hw/efa/efa_verbs.c > >>> @@ -436,12 +436,6 @@ void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) > >>> struct efa_dev *dev = to_edev(ibpd->device); > >>> struct efa_pd *pd = to_epd(ibpd); > >>> > >>> - if (udata->inlen && > >>> - !ib_is_udata_cleared(udata, 0, udata->inlen)) { > >>> - ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n"); > >>> - return; > >>> - } > >>> - > >>> ibdev_dbg(&dev->ibdev, "Dealloc pd[%d]\n", pd->pdn); > >>> efa_pd_dealloc(dev, pd->pdn); > >>> } > >>> @@ -459,12 +453,6 @@ int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) > >>> struct efa_qp *qp = to_eqp(ibqp); > >>> int err; > >>> > >>> - if (udata->inlen && > >>> - !ib_is_udata_cleared(udata, 0, udata->inlen)) { > >>> - ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n"); > >>> - return -EINVAL; > >>> - } > >>> - > >>> ibdev_dbg(&dev->ibdev, "Destroy qp[%u]\n", ibqp->qp_num); > >>> err = efa_destroy_qp_handle(dev, qp->qp_handle); > >>> if (err) > >>> @@ -865,12 +853,6 @@ int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) > >>> struct efa_cq *cq = to_ecq(ibcq); > >>> int err; > >>> > >>> - if (udata->inlen && > >>> - !ib_is_udata_cleared(udata, 0, udata->inlen)) { > >>> - ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n"); > >>> - return -EINVAL; > >>> - } > >>> - > >>> ibdev_dbg(&dev->ibdev, > >>> "Destroy cq[%d] virt[0x%p] freed: size[%lu], dma[%pad]\n", > >>> cq->cq_idx, cq->cpu_addr, cq->size, &cq->dma_addr); > >>> @@ -1556,12 +1538,6 @@ int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) > >>> struct efa_mr *mr = to_emr(ibmr); > >>> int err; > >>> > >>> - if (udata->inlen && > >>> - !ib_is_udata_cleared(udata, 0, udata->inlen)) { > >>> - ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n"); > >>> - return -EINVAL; > >>> - } > >>> - > >>> ibdev_dbg(&dev->ibdev, "Deregister mr[%d]\n", ibmr->lkey); > >>> > >>> if (mr->umem) { > >>> -- > >>> 2.20.1 > >>>