On Fri, Feb 7, 2014 at 2:11 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Fri, 31 Jan 2014, Dan Williams wrote: > >> In general we do not want khubd to act on port status changes that are >> the result of in progress resets or port pm runtime operations. > > The nomenclature is a little confusing. Let's agree that "port runtime > PM operations" will refer to usb_port_runtime_suspend/resume, and "USB > runtime PM operations" will refer to usb_port_suspend/resume. So now > this should say "in-progress resets or USB runtime PM operations". > >> Specifically port power control testing has been able to trigger an >> unintended disconnect in hub_port_connect_change(), paraphrasing: >> >> if ((portstatus & USB_PORT_STAT_CONNECTION) && udev && >> udev->state != USB_STATE_NOTATTACHED) { >> if (portstatus & USB_PORT_STAT_ENABLE) { >> /* Nothing to do */ >> } else if (udev->state == USB_STATE_SUSPENDED && >> udev->persist_enabled) { >> ... >> } else { >> /* Don't resuscitate */; >> } >> } >> >> ...by falling to the "Don't resuscitate" path or missing >> USB_PORT_STAT_CONNECTION becuase usb_port_resume() was in the middle of >> modifying the port status. >> >> The lock ordering rules are now usb_lock_device() => usb_lock_port(). > > Maybe you should mention the fact that you're introducing a new lock > before describing the ordering rule. Clarified the changelog accordingly... > >> This is mandated by the device core which may hold the device_lock on >> the usb_device before invoking usb_port_{suspend|resume} which in turn >> take the status_lock on the usb_port. We attempt to hold the >> status_lock for the duration of a port_event() run, and drop/re-aquire >> it when needing to take the device_lock. > > This patch should eliminate hub->busy_bits. That field is unnecessary > now that we have real mutual exlusion for port-status operations. Might move that to a later cleanup, but it's probably small enough to roll into this one. > >> @@ -4371,17 +4389,21 @@ hub_power_remaining (struct usb_hub *hub) >> * usb_reset_and_verify_device() encounters changed descriptors (as from >> * a firmware download) >> * caller already locked the hub >> + * >> + * It assumes the port is locked on entry and arranges for it to be >> + * always unlocked on exit > > I don't really like this, but I guess it's okay. > >> */ >> -static void hub_port_connect_change(struct usb_hub *hub, int port1, >> - u16 portstatus, u16 portchange) >> +static void hub_port_connect_change_unlock(struct usb_hub *hub, int port1, >> + u16 portstatus, u16 portchange) > > You should add sparse annotations stating that this routine will unlock > and lock port_dev in unexpected ways. (If there's any way to do that; > you may also have to tell sparse that usb_{un}lock_port is an > {un}locking operation.) > >> @@ -4639,9 +4671,11 @@ static int hub_handle_remote_wakeup(struct usb_hub *hub, unsigned int port, >> /* TRSMRCY = 10 msec */ >> msleep(10); >> >> + usb_unlock_port(port_dev); >> usb_lock_device(udev); >> ret = usb_remote_wakeup(udev); >> usb_unlock_device(udev); >> + usb_lock_port(port_dev); > > These things mean hub_handle_remote_wakeup needs the sparse annotations > too. > > Come to think of it, you can now move the device locking down into > usb_remote_wakeup. You could also move the port unlocking down there, > if you passed port_dev as an additional argument. (It would have to be > NULL in the call from hcd_resume_work.) I'll just do the usb_lock_device() push down. Sparse does not like conditional locking [1]. > >> @@ -4731,7 +4765,13 @@ static void port_event(struct usb_hub *hub, int port1) >> /* take port actions that require the port to be powered on */ >> pm_runtime_get_noresume(&port_dev->dev); >> pm_runtime_barrier(&port_dev->dev); >> - if (pm_runtime_active(&port_dev->dev)) { >> + usb_lock_port(port_dev); >> + do if (pm_runtime_active(&port_dev->dev)) { > > Please just use a normal "if" statement. With the change below I also removed this. > >> + /* re-read portstatus now that we are in-sync with >> + * usb_port_{suspend|resume} >> + */ >> + if (hub_port_status(hub, port1, &portstatus, &portchange) < 0) >> + break; > > For the most part, neither the pm_runtime_barrier nor the usb_lock_port > will have to wait. It seems unreasonable to read the port status a > second time when there's no need. > > Why not put the pm_runtime_get_noresume, pm_runtime_barrier, and > usb_lock_port calls at the start of this subroutine, just before the > first call to hub_port_status? Patch 9 could be altered accordingly. I now wrap port_event() with the lock to resolve this. > >> @@ -4751,17 +4791,22 @@ static void port_event(struct usb_hub *hub, int port1) >> if (status < 0) >> hub_port_disable(hub, port1, 1); >> } else { >> + usb_unlock_port(port_dev); >> usb_lock_device(udev); >> status = usb_reset_device(udev); >> usb_unlock_device(udev); >> + usb_lock_port(port_dev); > > Sparse annotations. > > You forgot to acquire the port lock in usb_reset_device. The lock and > unlock calls should surround the call to usb_reset_and_verify_device. > Both that routine and hub_port_init should be documented as requiring > their caller to hold the port lock. This might be a reason to move the lock to the usb_device itself. Otherwise we'll end up with a conditional lock when resetting root hub devices. Sparse will not be able to statically verify such a lock [1]. I had assumed that calliing usb_reset_device() on a root hub was not allowed. Hmm, maybe better to just up-level the udev->parent check to usb_reset_device directly. [1]: https://lwn.net/Articles/109066/ -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html