On Tue, Nov 12, 2024 at 02:55:12PM +0530, Selvin Xavier wrote: > +Michael Chan > > On Tue, Nov 12, 2024 at 1:47 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote: > > > > On Fri, Nov 08, 2024 at 12:42:39AM -0800, Selvin Xavier wrote: > > > From: Kalesh AP <kalesh-anakkur.purayil@xxxxxxxxxxxx> > > > > > > Move the logic to setup and enable NQs to a new function. > > > Similarly moved the NQ cleanup logic to a common function. > > > Introdued a flag to keep track of NQ allocation status > > > and added sanity checks inside bnxt_re_stop_irq() and > > > bnxt_re_start_irq() to avoid possible race conditions. > > > > > > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@xxxxxxxxxxxx> > > > Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> > > > --- > > > drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 + > > > drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++------------- > > > 2 files changed, 123 insertions(+), 83 deletions(-) > > > > <...> > > > > > > > > + rtnl_lock(); > > > + if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags)) > > > + bnxt_re_clean_nqs(rdev); > > > + rtnl_unlock(); > > > > <...> > > > > > + rtnl_lock(); > > > bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx); > > > bnxt_qplib_disable_rcfw_channel(&rdev->rcfw); > > > type = bnxt_qplib_get_ring_type(rdev->chip_ctx); > > > bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type); > > > + rtnl_unlock(); > > > > Please don't add rtnl_lock() to drivers in RDMA subsystem. BNXT driver > > is managed through netdev and it is there all proper locking should be > > done. > The main reason for bnxt_re to take the rtnl is because of the MSIx > resource configuration. > This is because the NIC driver is dynamically modifying the MSIx table > when the number > of ring change or ndo->open/close is invoked. So we stop and restart > the interrupts of RoCE also with rtnl held. rtnl_lock is a big kernel lock, which blocks almost everything in the netdev. In your case, you are changing one device configuration and should use your per-device locks. Even in the system with more than one BNXT device, the MSI-X on one device will influence other "innocent" devices. > > > > Please work to remove existing rtnl_lock() from bnxt_re_update_en_info_rdev() too. > > IMHO that lock is not needed after your driver conversion to auxbus. > This check is also to synchronize between the irq_stop and restart > implementation between > bnxt_en and bnxt_re driver and roce driver unload. > > We will review this locking and see if we can handle it. But it is a > major design change in both L2 > and roce drivers. You are adding new rtnl_lock and not moving it from one place to another, so this redesign should be done together with this new feature. Thanks > > > > Thanks