Re: [PATCH for-rc 4/5] RDMA/bnxt_re: Fix error recovery sequence

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 10, 2024 at 5:18 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Mon, Dec 09, 2024 at 10:13:23AM +0530, Selvin Xavier wrote:
> > On Thu, Dec 5, 2024 at 5:08 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Dec 05, 2024 at 03:01:25PM +0530, Kalesh Anakkur Purayil wrote:
> > > > On Thu, Dec 5, 2024 at 2:37 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Dec 04, 2024 at 01:24:15PM +0530, Kalesh AP wrote:
> > > > > > Fixed to return ENXIO from __send_message_basic_sanity()
> > > > > > to indicate that device is in error state. In the case of
> > > > > > ERR_DEVICE_DETACHED state, the driver should not post the
> > > > > > commands to the firmware as it will time out eventually.
> > > > > >
> > > > > > Removed bnxt_re_modify_qp() call from bnxt_re_dev_stop()
> > > > > > as it is a no-op.
> > > > > >
> > > > > > Fixes: cc5b9b48d447 ("RDMA/bnxt_re: Recover the device when FW error is detected")
> > > > > >
> > > > >
> > > > > Please don't add blank line here.
> > > > Sure, my bad. I missed it. Thanks for pointing it out.
> > > > >
> > > > > > Reviewed-by: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
> > > > > > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@xxxxxxxxxxxx>
> > > > > > Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx>
> > > > > > ---
> > > > > >  drivers/infiniband/hw/bnxt_re/main.c       | 8 +-------
> > > > > >  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 7 ++++---
> > > > > >  drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 3 +++
> > > > > >  3 files changed, 8 insertions(+), 10 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
> > > > > > index b7af0d5ff3b6..c143f273b759 100644
> > > > > > --- a/drivers/infiniband/hw/bnxt_re/main.c
> > > > > > +++ b/drivers/infiniband/hw/bnxt_re/main.c
> > > > > > @@ -1715,11 +1715,8 @@ static bool bnxt_re_is_qp1_or_shadow_qp(struct bnxt_re_dev *rdev,
> > > > > >
> > > > > >  static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev)
> > > > > >  {
> > > > > > -     int mask = IB_QP_STATE;
> > > > > > -     struct ib_qp_attr qp_attr;
> > > > > >       struct bnxt_re_qp *qp;
> > > > > >
> > > > > > -     qp_attr.qp_state = IB_QPS_ERR;
> > > > > >       mutex_lock(&rdev->qp_lock);
> > > > > >       list_for_each_entry(qp, &rdev->qp_list, list) {
> > > > > >               /* Modify the state of all QPs except QP1/Shadow QP */
> > > > > > @@ -1727,12 +1724,9 @@ static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev)
> > > > > >                       if (qp->qplib_qp.state !=
> > > > > >                           CMDQ_MODIFY_QP_NEW_STATE_RESET &&
> > > > > >                           qp->qplib_qp.state !=
> > > > > > -                         CMDQ_MODIFY_QP_NEW_STATE_ERR) {
> > > > > > +                         CMDQ_MODIFY_QP_NEW_STATE_ERR)
> > > > > >                               bnxt_re_dispatch_event(&rdev->ibdev, &qp->ib_qp,
> > > > > >                                                      1, IB_EVENT_QP_FATAL);
> > > > > > -                             bnxt_re_modify_qp(&qp->ib_qp, &qp_attr, mask,
> > > > > > -                                               NULL);
> > > > > > -                     }
> > > > > >               }
> > > > > >       }
> > > > > >       mutex_unlock(&rdev->qp_lock);
> > > > > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > > > > > index 5e90ea232de8..c8e65169f58a 100644
> > > > > > --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > > > > > +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> > > > > > @@ -423,8 +423,9 @@ static int __send_message_basic_sanity(struct bnxt_qplib_rcfw *rcfw,
> > > > > >       cmdq = &rcfw->cmdq;
> > > > > >
> > > > > >       /* Prevent posting if f/w is not in a state to process */
> > > > > > -     if (test_bit(ERR_DEVICE_DETACHED, &rcfw->cmdq.flags))
> > > > > > -             return bnxt_qplib_map_rc(opcode);
> > > > > > +     if (RCFW_NO_FW_ACCESS(rcfw))
> > > > > > +             return -ENXIO;
> > > > > > +
> > > > > >       if (test_bit(FIRMWARE_STALL_DETECTED, &cmdq->flags))
> > > > > >               return -ETIMEDOUT;
> > > > > >
> > > > > > @@ -493,7 +494,7 @@ static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
> > > > > >
> > > > > >       rc = __send_message_basic_sanity(rcfw, msg, opcode);
> > > > > >       if (rc)
> > > > > > -             return rc;
> > > > > > +             return rc == -ENXIO ? bnxt_qplib_map_rc(opcode) : rc;
> > > > > >
> > > > > >       rc = __send_message(rcfw, msg, opcode);
> > > > > >       if (rc)
> > > > > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > > > > > index 88814cb3aa74..4f7d800e35c3 100644
> > > > > > --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > > > > > +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
> > > > > > @@ -129,6 +129,9 @@ static inline u32 bnxt_qplib_set_cmd_slots(struct cmdq_base *req)
> > > > > >
> > > > > >  #define RCFW_MAX_COOKIE_VALUE                (BNXT_QPLIB_CMDQE_MAX_CNT - 1)
> > > > > >  #define RCFW_CMD_IS_BLOCKING         0x8000
> > > > > > +#define RCFW_NO_FW_ACCESS(rcfw)                                              \
> > > > > > +     (test_bit(ERR_DEVICE_DETACHED, &(rcfw)->cmdq.flags) ||          \
> > > > > > +      pci_channel_offline((rcfw)->pdev))
> > > > >
> > > > > There is some disconnection between description and implementation.
> > > > > ERR_DEVICE_DETACHED is set when device is suspended, at this stage all
> > > > > FW commands should stop already and if they are not, bnxt_re has bugs
> > > > > in cleanup path. It should flush/cancel/e.t.c and not randomly test some
> > > > > bit.
> > > > Yes, the device is in reset state. All outstanding firmware commands
> > > > are suspended. We do not want to post any new commands to firmware in
> > > > the recovery teardown path. Any commands sent to firmware at this
> > > > point will time out.
> > > > To avoid that, before posting the command driver checks the state and
> > > > returns early.
> > >
> > > I understand that. Please reread my sentence "all FW commands should stop
> > > already and if they are not, bnxt_re has bugs in cleanup path.", and answer
> > > is how is it possible to get FW commands during restore.
> > Hi Leon,
> >
> > Not sure if I also got your point correctly. Once the error recovery
> > gets initiated, we
> > unregister the ib device in the suspend path. During the ib_unregister_device,
> > we get verb commands to destroy the QP, CQs etc. We want to prevent sending the
> > new commands to FW for all these operations. We also want to avoid sending
> > any resource creation commands from the stack while the device is
> > getting re-initialized
>
> The thing is that during ib_unregister_device nothing external to driver
> is going to be sent to FW.
ib_unregister will trigger the destroy_qp (at least QP1) and
destroy_cqs. Those verbs will be trying to
issue the FW command and we are trying to prevent sending it to FW here.
We need a check here in the common path to avoid sending any commands
and i dont see a way
to handle this otherwise. Current check has a bug where the return
code check was not correct and we
ended up sending some of the commands that eventually timeout.

Just to add, We have similar implementation in our L2 driver also,
which we were trying to replicate using
bnxt_re data structures.

#define BNXT_NO_FW_ACCESS(bp)                                   \
        (test_bit(BNXT_STATE_FW_FATAL_COND, &(bp)->state) ||    \
         pci_channel_offline((bp)->pdev))

Thanks,
Selvin
>
> > This is a common check that prevents more commands from the stack down
> > to Firmware.
> >
> > >
> > > > >
> > > > > In addition, pci_channel_offline() in driver which doesn't manage PCI
> > > > > device looks strange to me. It should be part of bnxt core and not
> > > > > related to IB.
> > > > The bnxt_re driver also has a firmware communication channel where it
> > > > writes to BAR to issue firmware commands. When the PCI channel is
> > > > offline, any commands issued from the driver will time out eventually.
> > > > To prevent that we added this extra check to detect that condition early.
> > >
> > > This micro optimization where you check in some random place for pci channel
> > > status is not correct.
> > Will remove pci_channel_offline check and come up with some other
> > mechanism to handle this case.
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >  #define HWRM_VERSION_DEV_ATTR_MAX_DPI  0x1000A0000000DULL
> > > > > >  /* HWRM version 1.10.3.18 */
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Kalesh A P
> > >
> > >
>
>

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux