> > + > > + switch (event->type) { > > + case GDMA_EQE_RNIC_QP_FATAL: > > + qpn = event->details[0]; > > + xa_lock_irqsave(&mdev->qp_table_rq, flag); > > + qp = xa_load(&mdev->qp_table_rq, qpn); > > + if (qp) > > + refcount_inc(&qp->refcount); > > > Move this to after checking for "if (!qp) break". Then we can have a race condition. Imagine that EQE came when we tried to destroy QP and the got the following execution order: EQE | QP destroy 1 xa_lock | 2 qp = xa_load | 3 xa_unlock | 4 | xa_erase_irq 5 | refcount_dec 6 | complete 7 refcount_inc | <-------- BUG > > > + xa_unlock_irqrestore(&mdev->qp_table_rq, flag); > > + if (!qp) > > + break; > > + if (qp->ibqp.event_handler) { > > + ev.device = qp->ibqp.device; > > + ev.element.qp = &qp->ibqp; > > + ev.event = IB_EVENT_QP_FATAL; > > + qp->ibqp.event_handler(&ev, qp->ibqp.qp_context); > > + } > > + if (refcount_dec_and_test(&qp->refcount)) > > + complete(&qp->free); > > + break; > Strange logic. Why not do: > if (!refcount_dec_and_test(&qp->refcount)) > wait_for_completion(&qp->free); > It might work, but the logic will be even stranger and it will prevent some debugging. With the proposed change, qp->free may not be completed even though the counter is 0. As a result, the change makes an incorrect state to be an expected state, thereby making bugs with that side effect undetectable. E.g., we have a bug "use after free" and then we try to trace whether qp was in use. Plus, it is a good practice deinit everything that was inited. With the proposed change it is violated.