On Thu, Jun 06, 2024 at 09:30:51AM +0000, Konstantin Taranov wrote: > > > > > +static void > > > > > +mana_ib_event_handler(void *ctx, struct gdma_queue *q, struct > > > > > +gdma_event *event) { > > > > > + struct mana_ib_dev *mdev = (struct mana_ib_dev *)ctx; > > > > > + struct mana_ib_qp *qp; > > > > > + struct ib_event ev; > > > > > + unsigned long flag; > > > > > + u32 qpn; > > > > > + > > > > > + switch (event->type) { > > > > > + case GDMA_EQE_RNIC_QP_FATAL: > > > > > + qpn = event->details[0]; > > > > > + xa_lock_irqsave(&mdev->qp_table_rq, flag); > > > > > + qp = xa_load(&mdev->qp_table_rq, qpn); > > > > > + if (qp) > > > > > + refcount_inc(&qp->refcount); > > > > > + xa_unlock_irqrestore(&mdev->qp_table_rq, flag); > > > > > + if (!qp) > > > > > + break; > > > > > + if (qp->ibqp.event_handler) { > > > > > + ev.device = qp->ibqp.device; > > > > > + ev.element.qp = &qp->ibqp; > > > > > + ev.event = IB_EVENT_QP_FATAL; > > > > > + qp->ibqp.event_handler(&ev, qp->ibqp.qp_context); > > > > > + } > > > > > + if (refcount_dec_and_test(&qp->refcount)) > > > > > + complete(&qp->free); > > > > > + break; > > > > > + default: > > > > > + break; > > > > > + } > > > > > +} > > > > > > > > <...> > > > > > > > > > @@ -620,6 +626,11 @@ static int mana_ib_destroy_rc_qp(struct > > > > mana_ib_qp *qp, struct ib_udata *udata) > > > > > container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev); > > > > > int i; > > > > > > > > > > + xa_erase_irq(&mdev->qp_table_rq, qp->ibqp.qp_num); > > > > > + if (refcount_dec_and_test(&qp->refcount)) > > > > > + complete(&qp->free); > > > > > + wait_for_completion(&qp->free); > > > > > > > > This flow is unclear to me. You are destroying the QP and need to > > > > make sure that mana_ib_event_handler is not running at that point > > > > and not mess with refcount and complete. > > > > > > Hi, Leon. Thanks for the concern. Here is the clarification: > > > The flow is similar to what mlx5 does with mlx5_get_rsc and > > mlx5_core_put_rsc. > > > When we get a QP resource, we increment the counter while holding the xa > > lock. > > > So, when we destroy a QP, the code first removes the QP from the xa to > > ensure that nobody can get it. > > > And then we check whether mana_ib_event_handler is holding it with > > refcount_dec_and_test. > > > If the QP is held, then we wait for the mana_ib_event_handler to call > > complete. > > > Once the wait returns, it is impossible to get the QP referenced, since it is > > not in the xa and all references have been removed. > > > So now we can release the QP in HW, so the QPN can be assigned to new > > QPs. > > > > > > Leon, have you spotted a bug? Or just wanted to understand the flow? > > > > I understand the "general" flow, but think that implementation is very > > convoluted here. In addition, mlx5 and other drivers make sure sure that HW > > object is not free before they free it. They don't "mess" with ibqp, and > > probably you should do the same. > > I can make the code more readable by introducing 4 helpers: add_qp_ref, get_qp_ref, put_qp_ref, destroy_qp_ref. > Would it make the code less convoluted for you? The thing is that you are trying to open-code part of restrack logic, which already has xarray DB per-QPN, maybe you should extend it to support your case, by adding some sort of barrier to prevent QP from being used. > > The devices are different. Mana can do EQE with QPN, which can be destroyed by OS. With that reference counting I make sure > we do not destroy QP which is used in EQE processing. We still destroy the HW resource at same time as before the change. > The xa table is just a lookup table, which says whether object can be referenced or not. The change just dictates that we first > make a QP not referenceable. > > Note, that if we remove the QP from the table after we destroy it in HW, we can have a bug due to the collision in the xa table when > at the same time another entity creates a QP. Since the QPN is released in HW, it will most likely given to the next create_qp (so mana, unlike mlx, > does not assign QPNs with an increment and gives recently used QPN). So, the create_qp can fail as the QPN is still used in the xa. > > And what do you mean that "don't "mess" with ibqp"? Are you saying that we cannot process QP-related interrupts? > What do you propose to change? In any case it will be the same logic: > 1) remove from table > 2) make sure that current IRQ does not hold a reference > I use counters for that as most of other IB providers. > > I would also appreciate a list of changes to make this patch approved or confirmation that no changes are required. I'm not asking to change anything at this point, just trying to see if there is more general way to solve this problem, which exists in all drivers now and in the future. Thanks > Thanks > > > Thanks > > > > > Thanks > > > > > > > > > > > Thanks