Re: [for-next PATCH 2/4] RDMA/bnxt_re: Refactor rdev init/uninit path and simplifying rtnl_lock usage

Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> · Fri, 7 Dec 2018 11:43:50 +0530

On Fri, Dec 7, 2018 at 12:27 AM Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
>
> On Thu, Dec 06, 2018 at 10:49:44AM -0800, Selvin Xavier wrote:
> > From: Somnath Kotur <somnath.kotur@xxxxxxxxxxxx>
> >
> >  - Add more flags for better granularity in rdev init/uninit path.
> >  - bnxt_re_ib_reg() handles two main functionalities - initializing
> >    the device and registering with the IB stack.  Split it into 2
> >    functions i.e. bnxt_re_dev_init() and bnxt_re_ib_init()  to account
> >    for the same thereby improve modularity. Do the same for
> >    bnxt_re_ib_unreg()i.e. split into two functions - bnxt_re_dev_uninit()
> >    and  bnxt_re_ib_uninit().
> >  - Since bnxt_re_dev_init() part of bnxt_re_ib_reg()
> >    takes rtnl_lock inside for the entire function, so it is better to
> >    invoke it from bnxt_netdev_event() itself where the rtnl_lock is held
> >    already for  NETDEV_EVENT.
>
> Drivers cannot hold rtnl_lock while calling ib_unregister_device() (or
> really any lock, unless done carefully). Looks like bnxt does this
> already and prehaps this patch makes it worse?
>
Not really. You are right that bnxt is already calling
ib_unregister_device from rtnl_lock.
Current code has three instances of calling ib_unregister_device from
rtnl_lock context,
We  are removing two of them in this patch.

Un-int routine is split into two. bnxt_re_ib_uninit and bnxt_re_remove_device .
bnxt_re_ib_uninit  calls ib_unregister_device and
bnxt_re_remove_device destroys the resources from HW.

In this patch, we call bnxt_re_ib_uninit outside rtnl_lock and
bnxt_re_remove_device with rtnl_lock
 (except in bnxt_re_shutdown which i will explain later).

In case of mod_exit, bnxt_re_ib_uninit  is invoked first outside the rtnl_lock.
After this, rtnl_lock is acquired and bnxt_re_remove_device is invoked.

In the netdev_event to unregister the device,  we handle the uninit in
two stages now

>>         case NETDEV_UNREGISTER:
>> @@ -1555,9 +1581,14 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
>>                  */
>>                 if (atomic_read(&rdev->sched_count) > 0)
>>                         goto exit;
>> -               bnxt_re_ib_unreg(rdev);
>> -               bnxt_re_remove_one(rdev);
>> -               bnxt_re_dev_unreg(rdev);
>> +               /* Schedule work for unregistering from IB stack */
>> +               if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
>> +                       sch_work = true;
>> +                       break;
>> +               }

if the IB device is registered, we schedule a work to invoke
ib_unregister_device and return notifier.
So bnxt_re_ib_uninit is invoked from the bnxt_re_task.

>> +
>> +               /* IB device is removed now. Destroy the device */
>> +               bnxt_re_remove_device(rdev);
>>                 break;

Since the netdev reference is not released by bnxt_re driver (ref is
released in bnxt_re_remove_device),
the network stack will invoke all notifiers again till all the
references are released.
During the second or later invocation of the netdev notifier, we
proceed with the bnxt_re_remove_device.

bnxt_re_shutdown is a call back from the L2 driver. L2 driver acquires
the rtnl_lock
and invokes this hook.  Current Interface doesn't allow us to return a
failure and allow L2
driver to call back again. We will come up with a solution for this
and will fix it in another patch.