Re: [PATCH 2/2] USB: don't hold usb_bus_list_lock for usb_disconnect()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 13, 2013 at 8:11 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 13 Sep 2013, Bjorn Helgaas wrote:
>
>> Thanks.  Maybe this is more relevant than I thought.  I'd sure like to
>> copy your strategy rather than reinvent something.
>
> Well, I don't know if this will really end up being all that relevant
> for PCI, but you're welcome to steal any ideas you like.  :-)
>
>> > Very briefly, usb_device_read() dereferences bus->root_hub:
>> >
>> >         usb_lock_device(bus->root_hub);
>> >
>> > On the other hand, usb_remove_hcd() calls both
>> >
>> >         usb_disconnect(&rhdev);
>> >
>> > and
>> >
>> >         usb_put_dev(hcd->self.root_hub);
>> >
>> > (note that rhdev == hcd->self.root_hub == bus->root_hub), which
>> > unregisters and deallocates the root_hub structure.  Something has to
>> > prevent usb_device_read() from accessing the structure after it has
>> > been unregistered and freed; that's where usb_bus_list_lock comes in.
>>
>> I hadn't paid attention to hcd->rh_registered before, but that seems
>> important.  When iterating over usb_bus_list, usb_device_read()
>> ignores buses with hcd->rh_registered == 0.
>
> Right.  In theory this could happen either because the root hub hasn't
> been registered yet or it has already been (or is about to be)
> unregistered.
>
>> usb_remove_hcd() clears hcd->rh_registered before deallocating the
>> root_hub with usb_put_dev().  I think the mutex_lock/unlock between
>> clearing rh_registered and deallocating root_hub will ensure that
>> usb_device_read() cannot see rh_registered == 1 and a deallocated
>> root_hub structure.
>
> Exactly.  You've got it.
>
>>  But that seems more like a memory barrier issue
>> than a locking issue, and I don't yet see how usb_disconnect() is
>> related.
>
> It's more than just a memory barrier.  Suppose usb_device_read() sees
> that rh_registered is set.  Then we don't want usb_remove_hcd() to
> deallocate the root hub until usb_device_read() is finished using it,
> and the way to prevent that is by making usb_device_read() hold
> usb_bus_list_lock somewhere in between setting rh_registered back to 0
> and the final usb_device_put().

This gets to the reason I'm interested -- we have PCI device removal
issues, and I have been thinking of them largely as reference counting
problems.  I'd like to make the assertion that if you have a pci_bus
or a pci_dev pointer, e.g., acquired by traversing a list similar to
usb_bus_list, you can assume the whole structure is valid, including
things it contains or points to.

But here you have the additional rh_registered flag, which basically
tells you whether bus->root_hub is valid.  My naive question is why
you don't wait to do the usb_device_put(hcd->self.root_hub) until
after you remove the bus from usb_bus_list.  If you could do that,
usb_device_read() would still need the lock, like all other readers of
usb_bus_list, but it wouldn't need to look at rh_registered.

> The fact that this "somewhere" surrounds a call to usb_disconnect() is
> more or less irrelevant.  It may have been more important in earlier
> kernel versions (i.e., before hcd->rh_registered was invented), I don't
> remember.  In theory, it would now be okay to move the usb_disconnect()
> call after the locked region, but it would look kind of strange to
> acquire the mutex and immediately release it without doing anything in
> between.  I guess the next best thing would be to hold
> usb_bus_list_lock instead across these lines:
>
>         spin_lock_irq (&hcd_root_hub_lock);
>         hcd->rh_registered = 0;
>         spin_unlock_irq (&hcd_root_hub_lock);
>
> That at least would make sense, and it would point out that
> hcd->rh_registered really is protected by both the mutex and the
> spinlock.

That would make a lot more sense to me.  The current code gives a
strong hint that the lock is related to something done in
usb_disconnect(), but that's really a pointer in the wrong direction.

>> I guess I'm hung up because I don't know how managing the set of root
>> hubs is different from managing usb_bus_list.  Maybe it has something
>> to do with the device_add() call in register_root_hub() and the
>> device_del() call in usb_disconnect().  We hold usb_bus_list_lock in
>> both cases, but neither does anything with usb_bus_list.  Or maybe it
>> has something to do with the visibility of stores to
>> hcd->rh_registered?
>
> Basically yes.  We need to prevent writes to hcd->rh_registered in both
> interrupt context (for example, in usb_hc_died()) and process context
> (usb_device_read()).  Therefore it has to be write-protected by both a
> spinlock and a mutex: hcd_root_hub_lock and usb_bus_list_lock.
>
> The bus list, by contrast, is used only in process context.  Therefore
> the mutex suffices to protect it.
>
> Does this all make sense?  I realize it's rather obscure, and it
> doesn't help that the usages changed after the names were chosen.

It does make sense, and I appreciate your patience.  Thanks!

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux