On Tue, Jun 04, 2024 at 04:36:03AM -0700, Konstantin Taranov wrote:
> From: Konstantin Taranov <kotaranov@xxxxxxxxxxxxx>
>
> Process QP fatal events from the error event queue.
> For that, find the QP, using QPN from the event, and then call its
> event_handler. To find the QPs, store created RC QPs in an xarray.
>
> Signed-off-by: Konstantin Taranov <kotaranov@xxxxxxxxxxxxx>
> Reviewed-by: Wei Hu <weh@xxxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mana/device.c           |  3 ++
>  drivers/infiniband/hw/mana/main.c             | 37 ++++++++++++++++++-
>  drivers/infiniband/hw/mana/mana_ib.h          |  4 ++
>  drivers/infiniband/hw/mana/qp.c               | 11 ++++++
>  .../net/ethernet/microsoft/mana/gdma_main.c   |  1 +
>  include/net/mana/gdma.h                       |  1 +
>  6 files changed, 55 insertions(+), 2 deletions(-)

<...>

> +static void
> +mana_ib_event_handler(void *ctx, struct gdma_queue *q, struct gdma_event *event)
> +{
> +	struct mana_ib_dev *mdev = (struct mana_ib_dev *)ctx;
> +	struct mana_ib_qp *qp;
> +	struct ib_event ev;
> +	unsigned long flag;
> +	u32 qpn;
> +
> +	switch (event->type) {
> +	case GDMA_EQE_RNIC_QP_FATAL:
> +		qpn = event->details[0];
> +		xa_lock_irqsave(&mdev->qp_table_rq, flag);
> +		qp = xa_load(&mdev->qp_table_rq, qpn);
> +		if (qp)
> +			refcount_inc(&qp->refcount);
> +		xa_unlock_irqrestore(&mdev->qp_table_rq, flag);
> +		if (!qp)
> +			break;
> +		if (qp->ibqp.event_handler) {
> +			ev.device = qp->ibqp.device;
> +			ev.element.qp = &qp->ibqp;
> +			ev.event = IB_EVENT_QP_FATAL;
> +			qp->ibqp.event_handler(&ev, qp->ibqp.qp_context);
> +		}
> +		if (refcount_dec_and_test(&qp->refcount))
> +			complete(&qp->free);
> +		break;
> +	default:
> +		break;
> +	}
> +}

<...>

> @@ -620,6 +626,11 @@ static int mana_ib_destroy_rc_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
>  		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
>  	int i;
>  
> +	xa_erase_irq(&mdev->qp_table_rq, qp->ibqp.qp_num);
> +	if (refcount_dec_and_test(&qp->refcount))
> +		complete(&qp->free);
> +	wait_for_completion(&qp->free);

This flow is unclear to me. You are destroying the QP, so you need to
make sure that mana_ib_event_handler is not running at that point,
instead of messing with the refcount and the completion.

Thanks
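
For reference, the quoted destroy hunk reads like the common "unpublish, drop the initial reference, then wait for the last user" teardown. Below is a minimal user-space sketch of that pattern, not the driver code. It assumes the QP starts with a refcount of 1 held by its creator, which the quoted hunks do not show. Every demo_* name is hypothetical: a mutex-protected pointer stands in for the xarray, and a mutex plus condition variable stands in for struct completion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the QP; not the mana_ib structure. */
struct demo_qp {
	atomic_int refcount;       /* assumed to start at 1 (creator's reference) */
	bool done;                 /* models struct completion, with lock + cond */
	pthread_mutex_t free_lock;
	pthread_cond_t free_cond;
};

/* One-entry stand-in for the xarray and its lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_qp *table_slot;

/* Models refcount_dec_and_test() followed by complete(). */
static void demo_put(struct demo_qp *qp)
{
	if (atomic_fetch_sub(&qp->refcount, 1) == 1) {
		pthread_mutex_lock(&qp->free_lock);
		qp->done = true;
		pthread_cond_signal(&qp->free_cond);
		pthread_mutex_unlock(&qp->free_lock);
	}
}

/* Models the event handler: look up and take a reference under the lock. */
static void *demo_event_handler(void *arg)
{
	struct demo_qp *qp;

	(void)arg;
	pthread_mutex_lock(&table_lock);
	qp = table_slot;
	if (qp)
		atomic_fetch_add(&qp->refcount, 1);
	pthread_mutex_unlock(&table_lock);
	if (!qp)
		return NULL;       /* already unpublished: nothing to touch */
	/* ... the user's event callback would run here ... */
	demo_put(qp);
	return NULL;
}

/* Models the destroy path: unpublish, drop the initial ref, wait. */
static void demo_destroy(struct demo_qp *qp)
{
	pthread_mutex_lock(&table_lock);
	table_slot = NULL;         /* no new lookups can find the QP */
	pthread_mutex_unlock(&table_lock);

	demo_put(qp);              /* drop the creator's reference */

	pthread_mutex_lock(&qp->free_lock);
	while (!qp->done)          /* wait for the last in-flight handler */
		pthread_cond_wait(&qp->free_cond, &qp->free_lock);
	pthread_mutex_unlock(&qp->free_lock);
	assert(atomic_load(&qp->refcount) == 0);
}

/* Races one handler against destroy; returns 0 if teardown was clean. */
int run_demo(void)
{
	struct demo_qp qp = {
		.done = false,
		.free_lock = PTHREAD_MUTEX_INITIALIZER,
		.free_cond = PTHREAD_COND_INITIALIZER,
	};
	pthread_t t;

	atomic_init(&qp.refcount, 1);
	pthread_mutex_lock(&table_lock);
	table_slot = &qp;
	pthread_mutex_unlock(&table_lock);

	if (pthread_create(&t, NULL, demo_event_handler, NULL))
		return -1;
	demo_destroy(&qp);
	pthread_join(t, NULL);

	return (atomic_load(&qp.refcount) == 0 && table_slot == NULL) ? 0 : -1;
}
```

In this model, a handler that took its reference before the erase keeps the QP alive until its own put, and a handler that looks up after the erase sees NULL and returns. Whether that is the right approach for the driver, as opposed to synchronizing with the event queue before teardown, is the question raised above.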