On Sat, 14 Sep 2013, Bjorn Helgaas wrote: > This gets to the reason I'm interested -- we have PCI device removal > issues, and I have been thinking of them largely as reference counting > problems. I'd like to make the assertion that if you have a pci_bus > or a pci_dev pointer, e.g., acquired by traversing a list similar to > usb_bus_list, you can assume the whole structure is valid, including > things it contains or points to. To make that work, you would have to hold the list's lock whenever you access the pointers contained in any structure on the list. A more common pattern is to lock the structure when you encounter it, and then assume the whole structure (including its embedded pointers) remains valid until it is unlocked. With USB root hubs, it's a little trickier for two reasons. First, the bus's root_hub pointer has to be set before the root hub is registered, and has to remain set after the root hub is unregistered. That's why the rh_registered flag is needed. Second, the usb_bus and usb_hcd structures don't contain a suitable mutex, so you can't lock them! (To be fair, the hcd_root_hub_lock -- and a couple of others -- could easily be embedded in the usb_hcd structure, rather than having a single global lock for all the buses. But there isn't much contention and it didn't seem worth the effort. Besides, hcd_root_hub_lock is a spinlock, whereas you would need a mutex.) > But here you have the additional rh_registered flag, which basically > tells you whether bus->root_hub is valid. My naive question is why > you don't wait to do the usb_device_put(hcd->self.root_hub) until > after you remove the bus from usb_bus_list. If you could do that, > usb_device_read() would still need the lock, like all other readers of > usb_bus_list, but it wouldn't need to look at rh_registered. Partly this is for symmetry. The root hub is created after the bus, therefore it is removed before the bus. Also, I'm not sure whether other issues would crop up if the removal were in the opposite order. Finally, usb_device_read() really does want to avoid looking at unregistered root hubs, not just deallocated root hubs. The analogous problem doesn't arise for non-root hubs. When a non-root device is registered or unregistered, the parent hub is locked the entire time. We can't do this for root hubs because they don't have parents (or at least, their parents aren't USB devices). So usb_bus_list_lock acts as a proxy for locking the parent -- but it's not exactly the same, because we don't hold the lock across the entire unregistration/removal procedures. (That fact probably has historical roots.) In fact, one way to solve this would be to insist that the root hub's parent device (which is a host controller) must be locked the entire time. This already is true during usb_add_hcd() and usb_remove_hcd(), because those routines are called from the controller driver's probe and remove routines. The problem is that usb_device_read() would cause an ABBA locking violation: The controller driver's probe routine runs with the controller locked, and it calls usb_add_hcd() which locks the bus list; usb_device_read() keeps the bus list locked and then would have to lock each controller as it is encountered. > > The fact that this "somewhere" surrounds a call to usb_disconnect() is > > more or less irrelevant. It may have been more important in earlier > > kernel versions (i.e., before hcd->rh_registered was invented), I don't > > remember. In theory, it would now be okay to move the usb_disconnect() > > call after the locked region, but it would look kind of strange to > > acquire the mutex and immediately release it without doing anything in > > between. I guess the next best thing would be to hold > > usb_bus_list_lock instead across these lines: > > > > spin_lock_irq (&hcd_root_hub_lock); > > hcd->rh_registered = 0; > > spin_unlock_irq (&hcd_root_hub_lock); > > > > That at least would make sense, and it would point out that > > hcd->rh_registered really is protected by both the mutex and the > > spinlock. > > That would make a lot more sense to me. The current code gives a > strong hint that the lock is related to something done in > usb_disconnect(), but that's really a pointer in the wrong direction. If you would like to make this change, it would be okay with me. Or maybe even better, move just the mutex_lock() call and leave the mutex_unlock() where it is. Then both the flag change and the unregistration would be protected by the lock. Note that register_root_hub() acts this way; it holds usb_bus_list_lock from before calling usb_new_device() to after setting hcd->rh_registered to 1. > > Does this all make sense? I realize it's rather obscure, and it > > doesn't help that the usages changed after the names were chosen. > > It does make sense, and I appreciate your patience. Thanks! You're welcome. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html