Re: [PATCH v3 1/2] RDMA/Core: add RDMA_NLDEV_CMD_NEWLINK/DELLINK support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 11/30/2018 10:27 AM, Jason Gunthorpe wrote:
> On Fri, Nov 30, 2018 at 09:40:52AM -0600, Steve Wise wrote:
>>
>> On 11/29/2018 10:54 PM, Jason Gunthorpe wrote:
>>> On Fri, Nov 30, 2018 at 01:29:22AM +0000, Ruhl, Michael J wrote:
>>>
>>>> Should the mutex be held while calling the callback?  If the newlink or dellink
>>>> callbacks need to sleep (probably dellink specifically), I think holding the mutex
>>>> might not be a good idea.
>>> It would be a bit better if this lock was a rwsem, but it must be held
>>> whenever ops is touched.. This is probably a moot point though when
>>> dellink goes through the ib_dev not the ops list..
>>>
>> I'm not sure what you mean, but my current code under test keeps the
>> rdma_link_ops pointer cached in the struct ib_device.  Then
>> nldev_dellink() looks like this:
>>
>> +       device = ib_device_get_by_index(index);
>> +       if (!device)
>> +               return -EINVAL;
>> +
>> +       mutex_lock(&link_ops_mutex);
>> +       /* Deref the device before deleting it.  Otherwise we
>> +        * deadlock unregistering the device.
>> +        */
>> +       ib_device_put(device);
>> +
>> +       if (device->link_ops)
>> +               err = device->link_ops->dellink(device);
> The driver must directly guarantee that the lifetime of everything
> under device, including link_ops, exists until ib_unregister_device()
> returns.
>
> For something like link_ops this happens without any extra effort.
>
> So the lock is not needed to keep link_ops alive.
>
>> I still think the mutex is needed...  Note I had to call ib_device_put()
>> -before- calling the driver dellink function or we deadlocked doing the
>> device unregister.
> Ohh... this is all racy in rxe, and rxe has locking bugs :(
>
> rxe calls ib_unregister_device from module unload, sysfs/netlink and
> netdev notifiers and does nothing to guarentee ib_unregsiter_device()
> is called exactly once.
>
> ib_device_put like this is also racy as there is nothing holding the
> kref in this case, this could get the kref before putting I suppose,
> but then the core netlink code will try and put an already put device
> and blamo... So I think you need to add some way to tell netlink that
> this entire command should use kref not put ?
>
> and I don't entirely know how to fix everything wrong with rxe, but
> broadly
> - List manipulation of rxe_dev_list must always hold the dev_list_lock,
>   at least rxe_notify doesn't
> - rxe_remove should do the list_del internally, and it should be
>   approximately:
>
>   spin_lock(&dev_list_lock); 
>   already_removed = list_empty(&rxe->list);
>   list_del_init(&rxe->list);
>   spin_unlock(&dev_list_lock);
>
>   if (!already_removed)
>       ibv_unregister_device(&rxe->ibdev);
>
> - Something needs to prevent module unload until all ib devices are
>   unregistered. If the notifier already has internal locking then it
>   probably makes sense to keep link_ops_mutex in the core code, but as 
>   a rwsem
>
> Jason

So perhaps nldev_dellink() should just lookup the ib_device by index
using ib_device_get_by_index(), note the device name (copy it locally)
and rdma_link_ops pointer, and call ib_device_put().  Then call the
driver's dellink() function, passing the device name.  Thus the driver
can coordinate the multiple ways the device could be unregistered.

An alternative that I discussed with Leon, was having nldev_dellink()
just traverse all the registered rdma_link_ops and calling each
dellink() function, passing the device name, until one succeeds or they
all fail.  Simple...

I'm going to avoid fixing already-existing RXE bugs as part of this
series if I can...

Steve.







[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux