Re: [PATCH 4/4] RDMA/bnxt_re: synchronize the qp-handle table array

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 13, 2024 at 10:00 AM Zhu Yanjun <yanjun.zhu@xxxxxxxxx> wrote:
>
> 在 2024/9/12 2:29, Selvin Xavier 写道:
> > There is a race between the CREQ tasklet and destroy
> > qp when accessing the qp-handle table. There is
> > a chance of reading a valid qp-handle in the
> > CREQ tasklet handler while the QP is already moving
> > ahead with the destruction.
> >
> > Fixing this race by implementing a table-lock to
> > synchronize the access.
> >
> > Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error")
> > Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing")
> > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@xxxxxxxxxxxx>
> > Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx>
> > ---
> >   drivers/infiniband/hw/bnxt_re/qplib_fp.c   |  5 +++++
> >   drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 12 ++++++++----
> >   drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  2 ++
> >   3 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
> > index 42e98e5..5d36216 100644
> > --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
> > +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
> > @@ -1524,12 +1524,15 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
> >       struct creq_destroy_qp_resp resp = {};
> >       struct bnxt_qplib_cmdqmsg msg = {};
> >       struct cmdq_destroy_qp req = {};
> > +     unsigned long flags;
> >       u32 tbl_indx;
> >       int rc;
> >
> > +     spin_lock_irqsave(&rcfw->tbl_lock, flags);
>
> Because the race occurs between tasklets, spin_lock_bh is enough to this?
You are right. The race occurs in the BH handler.
bnxt_qplib_service_creq , which is scheduled in the tasklet context,
is already using the spin_lock_irqsave for synchronizing this function
with other contexts.  The current code change is with-in this context.
I added this change on top of the current code and used irqsave.

But  I will fix both the spin lock usage and post a v2.

Thanks for the review.

>
> Zhu Yanjun
>
> >       tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
> >       rcfw->qp_tbl[tbl_indx].qp_id = BNXT_QPLIB_QP_ID_INVALID;
> >       rcfw->qp_tbl[tbl_indx].qp_handle = NULL;
> > +     spin_unlock_irqrestore(&rcfw->tbl_lock, flags);
> >
> >       bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req,
> >                                CMDQ_BASE_OPCODE_DESTROY_QP,
> > @@ -1540,8 +1543,10 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
> >                               sizeof(resp), 0);
> >       rc = bnxt_qplib_rcfw_send_message(rcfw, &msg);
> >       if (rc) {
> > +             spin_lock_irqsave(&rcfw->tbl_lock, flags);
> >               rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
> >               rcfw->qp_tbl[tbl_indx].qp_handle = qp;
> > +             spin_unlock_irqrestore(&rcfw->tbl_lock, flags);
> >               return rc;
> >       }
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > index 3ffaef0c..993c356 100644
> > --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > @@ -637,17 +637,21 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
> >       case CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION:
> >               err_event = (struct creq_qp_error_notification *)qp_event;
> >               qp_id = le32_to_cpu(err_event->xid);
> > +             spin_lock(&rcfw->tbl_lock);
> >               tbl_indx = map_qp_id_to_tbl_indx(qp_id, rcfw);
> >               qp = rcfw->qp_tbl[tbl_indx].qp_handle;
> > +             if (!qp) {
> > +                     spin_unlock(&rcfw->tbl_lock);
> > +                     break;
> > +             }
> > +             bnxt_qplib_mark_qp_error(qp);
> > +             rc = rcfw->creq.aeq_handler(rcfw, qp_event, qp);
> > +             spin_unlock(&rcfw->tbl_lock);
> >               dev_dbg(&pdev->dev, "Received QP error notification\n");
> >               dev_dbg(&pdev->dev,
> >                       "qpid 0x%x, req_err=0x%x, resp_err=0x%x\n",
> >                       qp_id, err_event->req_err_state_reason,
> >                       err_event->res_err_state_reason);
> > -             if (!qp)
> > -                     break;
> > -             bnxt_qplib_mark_qp_error(qp);
> > -             rc = rcfw->creq.aeq_handler(rcfw, qp_event, qp);
> >               break;
> >       default:
> >               /*
> > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > index 45996e6..07779ae 100644
> > --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > @@ -224,6 +224,8 @@ struct bnxt_qplib_rcfw {
> >       struct bnxt_qplib_crsqe         *crsqe_tbl;
> >       int qp_tbl_size;
> >       struct bnxt_qplib_qp_node *qp_tbl;
> > +     /* To synchronize the qp-handle hash table */
> > +     spinlock_t                      tbl_lock;
> >       u64 oos_prev;
> >       u32 init_oos_stats;
> >       u32 cmdq_depth;
>

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux