> > > > >> Jason, This feature took a step back, imo, by using > > > >> ib_device_get_by_index() in nldev_dellink() and passing the struct > > > >> ib_device ptr vs simply passing the ibdev name string to the driver's > > > >> dellink() function and letting the driver deal with it all. I'm > > > >> inclined to go back to that approach. I don't see things getting > > > >> cleaner and simpler on this current path. > > > > It doesn't solve anything to have the driver do the lookup, it still > > > > has to have proper locking and prevent the races. If it can do that, > > > > then it can do it with an already looked-up device just the same. > > > > > > It is easier by just passing the name to the driver because the driver > > > can easily determine if the device has been removed via a > > > driver-specific lock and lookup. Looking up the device via > > > ib_device_get_by_index() uses a core-private mechanism that isn't public > > > to the driver, so it makes life more difficult. I could export > > > ib_device_put() but that doesn't seem clean to me. > > > > > > > I'm still wondering if there is some core functions we could create to > > > > handle this common case for dellink drivers. > > > > > > Please noodle on it - I could use your help. Thanks... > > > > Maybe the simplest is to have dellink return 0 and then the core code > > directly does the unregister. Then the core code can handle the > > special locking with internal locks. > > > > To make this work we probably need to add a 'close' callback into > > ib_device so that the core can sequence destruction properly - ie > > everything that happens in the driver after unregister should be done > > in close. > > > > This is sort of what netdev does. > > This approach results in ib device unregister done by the driver for cases like > NETDEV_UNREGISTER and module unload, and ib_device unregister done by > the core for just DELLINK. I don't like that. > > Also, the RXE unregister logic would become more complicated because the > "last man out must call ib_device_unregister()" is now dependent on which > path is doing this final device remove... > > Here is a way to ensure the module doesn't unload. With this and my current rxe changes, I think the races are all handled. ... device = ib_device_get_by_index(index); if (!device) return -EINVAL; ops = device->link_ops; if (!ops) { ib_device_put(device); return -EINVAL; } /* * Deref the ib_device before deleting it. Otherwise we * deadlock unregistering the device. Hold a ref on the * underlying device to keep the memory around until we're * done. If the module owning the device is unloading, * then the driver will delete the device during unload. */ dev = get_device(&device->dev); owner = device->owner; removing = try_module_get(owner); ib_device_put(device); if (!removing) { ops->dellink(device); module_put(owner); } put_device(&device->dev); return err; ...