On Fri, Nov 30, 2018 at 09:40:52AM -0600, Steve Wise wrote: > > > On 11/29/2018 10:54 PM, Jason Gunthorpe wrote: > > On Fri, Nov 30, 2018 at 01:29:22AM +0000, Ruhl, Michael J wrote: > > > >> Should the mutex be held while calling the callback? If the newlink or dellink > >> callbacks need to sleep (probably dellink specifically), I think holding the mutex > >> might not be a good idea. > > It would be a bit better if this lock was a rwsem, but it must be held > > whenever ops is touched.. This is probably a moot point though when > > dellink goes through the ib_dev not the ops list.. > > > I'm not sure what you mean, but my current code under test keeps the > rdma_link_ops pointer cached in the struct ib_device. Then > nldev_dellink() looks like this: > > + device = ib_device_get_by_index(index); > + if (!device) > + return -EINVAL; > + > + mutex_lock(&link_ops_mutex); > + /* Deref the device before deleting it. Otherwise we > + * deadlock unregistering the device. > + */ > + ib_device_put(device); > + > + if (device->link_ops) > + err = device->link_ops->dellink(device); The driver must directly guarantee that the lifetime of everything under device, including link_ops, exists until ib_unregister_device() returns. For something like link_ops this happens without any extra effort. So the lock is not needed to keep link_ops alive. > I still think the mutex is needed... Note I had to call ib_device_put() > -before- calling the driver dellink function or we deadlocked doing the > device unregister. Ohh... this is all racy in rxe, and rxe has locking bugs :( rxe calls ib_unregister_device from module unload, sysfs/netlink and netdev notifiers and does nothing to guarentee ib_unregsiter_device() is called exactly once. ib_device_put like this is also racy as there is nothing holding the kref in this case, this could get the kref before putting I suppose, but then the core netlink code will try and put an already put device and blamo... So I think you need to add some way to tell netlink that this entire command should use kref not put ? and I don't entirely know how to fix everything wrong with rxe, but broadly - List manipulation of rxe_dev_list must always hold the dev_list_lock, at least rxe_notify doesn't - rxe_remove should do the list_del internally, and it should be approximately: spin_lock(&dev_list_lock); already_removed = list_empty(&rxe->list); list_del_init(&rxe->list); spin_unlock(&dev_list_lock); if (!already_removed) ibv_unregister_device(&rxe->ibdev); - Something needs to prevent module unload until all ib devices are unregistered. If the notifier already has internal locking then it probably makes sense to keep link_ops_mutex in the core code, but as a rwsem Jason