On Thu, Jul 22, 2021 at 09:06:07AM -0300, Jason Gunthorpe wrote: > On Thu, Jul 22, 2021 at 10:43:00AM +0300, Leon Romanovsky wrote: > > On Wed, Jul 21, 2021 at 04:51:55PM +0530, Dakshaja Uppalapati wrote: > > > Previous atomic increment decrement logic expects the atomic count > > > to be '0' after the final decrement. Replacing atomic count with > > > refcount does not allow that as refcount_dec() considers count of 1 as > > > underflow. Therefore fix the current refcount logic by decrementing > > > the refcount if one on the final deref in c4iw_destroy_cq(). > > > > > > Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting") > > > Signed-off-by: Dakshaja Uppalapati <dakshaja@xxxxxxxxxxx> > > > Reviewed-by: Potnuri Bharat Teja <bharat@xxxxxxxxxxx> > > > drivers/infiniband/hw/cxgb4/cq.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > Thanks, > > Reviewed-by: Leon Romanovsky <leonro@xxxxxxxxxx> > > > > We have plenty of such errors, worth to check them: > > ➜ kernel git:(rdma-next) git grep refcount_read drivers/infiniband/ | grep -v WARN_ON > > drivers/infiniband/core/device.c: if (!refcount_read(&ib_dev->refcount)) > > drivers/infiniband/core/device.c: if (refcount_read(&device->refcount) == 0 || > > drivers/infiniband/core/iwpm_util.c: if (!refcount_read(&iwpm_admin.refcount)) { > > drivers/infiniband/core/iwpm_util.c: if (!refcount_read(&iwpm_admin.refcount)) { > > drivers/infiniband/core/ucma.c: if (refcount_read(&ctx->ref)) > > drivers/infiniband/hw/cxgb4/cq.c: wait_event(chp->wait, !refcount_read(&chp->refcnt)); > > drivers/infiniband/hw/irdma/utils.c: refcount_read(&cqp_request->refcnt) == 1, 1000); > > drivers/infiniband/hw/mlx5/mlx5_ib.h: wait_event(mmkey->wait, refcount_read(&mmkey->usecount) == 0); > > drivers/infiniband/hw/mlx5/mr.c: refcount_read(&mr->mmkey.usecount) != 0 && > > It isn't the read that is the problem, it is the naked dec. > > This common pattern is just being done in an obtuse and arguably wrong > way > > It is supposed to look like this: > > void ib_device_put(struct ib_device *device) > { > if (refcount_dec_and_test(&device->refcount)) > complete(&device->unreg_completion); > } > > [..] > > ib_device_put(device); > wait_for_completion(&device->unreg_completion); > > > Not use refcount_read() and a naked work queue > > So I'd say these ones should be looked at: > > drivers/infiniband/hw/cxgb4/cq.c: refcount_dec(&chp->refcnt); > drivers/infiniband/hw/hns/hns_roce_db.c: refcount_dec(&db->u.user_page->refcount); > drivers/infiniband/hw/irdma/cm.c: refcount_dec(&cm_node->refcnt); > drivers/infiniband/hw/irdma/cm.c: refcount_dec(&listener->refcnt); > drivers/infiniband/hw/irdma/cm.c: refcount_dec(&listener->refcnt); Jason, We are talking about two different issues that this refcount_read patch pointed. You are focused on wrong usage of completion, I saw useless compare of refcount_t with 0 that can't be. I prepared series that cleans iwpm from this type of error: drivers/infiniband/core/iwpm_util.c: if (!refcount_read(&iwpm_admin.refcount)) { drivers/infiniband/core/iwpm_util.c: if (!refcount_read(&iwpm_admin.refcount)) { Thanks > > Jason