On Fri, Dec 7, 2018 at 12:27 AM Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote: > > On Thu, Dec 06, 2018 at 10:49:44AM -0800, Selvin Xavier wrote: > > From: Somnath Kotur <somnath.kotur@xxxxxxxxxxxx> > > > > - Add more flags for better granularity in rdev init/uninit path. > > - bnxt_re_ib_reg() handles two main functionalities - initializing > > the device and registering with the IB stack. Split it into 2 > > functions i.e. bnxt_re_dev_init() and bnxt_re_ib_init() to account > > for the same thereby improve modularity. Do the same for > > bnxt_re_ib_unreg()i.e. split into two functions - bnxt_re_dev_uninit() > > and bnxt_re_ib_uninit(). > > - Since bnxt_re_dev_init() part of bnxt_re_ib_reg() > > takes rtnl_lock inside for the entire function, so it is better to > > invoke it from bnxt_netdev_event() itself where the rtnl_lock is held > > already for NETDEV_EVENT. > > Drivers cannot hold rtnl_lock while calling ib_unregister_device() (or > really any lock, unless done carefully). Looks like bnxt does this > already and prehaps this patch makes it worse? > Not really. You are right that bnxt is already calling ib_unregister_device from rtnl_lock. Current code has three instances of calling ib_unregister_device from rtnl_lock context, We are removing two of them in this patch. Un-int routine is split into two. bnxt_re_ib_uninit and bnxt_re_remove_device . bnxt_re_ib_uninit calls ib_unregister_device and bnxt_re_remove_device destroys the resources from HW. In this patch, we call bnxt_re_ib_uninit outside rtnl_lock and bnxt_re_remove_device with rtnl_lock (except in bnxt_re_shutdown which i will explain later). In case of mod_exit, bnxt_re_ib_uninit is invoked first outside the rtnl_lock. After this, rtnl_lock is acquired and bnxt_re_remove_device is invoked. In the netdev_event to unregister the device, we handle the uninit in two stages now >> case NETDEV_UNREGISTER: >> @@ -1555,9 +1581,14 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier, >> */ >> if (atomic_read(&rdev->sched_count) > 0) >> goto exit; >> - bnxt_re_ib_unreg(rdev); >> - bnxt_re_remove_one(rdev); >> - bnxt_re_dev_unreg(rdev); >> + /* Schedule work for unregistering from IB stack */ >> + if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) { >> + sch_work = true; >> + break; >> + } if the IB device is registered, we schedule a work to invoke ib_unregister_device and return notifier. So bnxt_re_ib_uninit is invoked from the bnxt_re_task. >> + >> + /* IB device is removed now. Destroy the device */ >> + bnxt_re_remove_device(rdev); >> break; Since the netdev reference is not released by bnxt_re driver (ref is released in bnxt_re_remove_device), the network stack will invoke all notifiers again till all the references are released. During the second or later invocation of the netdev notifier, we proceed with the bnxt_re_remove_device. bnxt_re_shutdown is a call back from the L2 driver. L2 driver acquires the rtnl_lock and invokes this hook. Current Interface doesn't allow us to return a failure and allow L2 driver to call back again. We will come up with a solution for this and will fix it in another patch.