On Fri, 14 Jan 2011, Sarah Sharp wrote:

> The setup that displays this issue is a HS mult-tt hub plugged into the
> xHCI roothub. Sometimes when I quickly plug in a device into the HS hub
> right after the hub (but not the bus) is suspended, the hub will resume,

Actually the hub sends a wakeup request. The response of the xHCI
controller is no doubt described in the xHCI specification, but I
haven't read it. Some types of controller respond by automatically
starting port-resume signalling and generating an IRQ, leaving it up to
the driver to turn off the resume signal at the appropriate time.
Other types merely report the wakeup request, leaving it up to the
driver to begin the resume signalling.

If the hub in question weren't attached directly to the root hub but
instead was plugged into another HS hub, the response of the upstream
hub would be to automatically carry out a port resume and then notify
the host (by setting the Port-Suspend-Change status bit) after the
resume was finished.

> and then there will be a transfer error on the GetStatus for the HS hub.
> This causes the USB core to start a reset-resume, without first checking
> the port status of that port with a GetPortStatus call into the roothub.

(Does "that port" refer to the root-hub port to which the HS hub is
connected?)

The log below does not agree with this verbal description. Maybe there
is some confusion here; the Get-Port-Status call occurs _before_ the
Get-Device-Status request and its transfer error.

> The xHCI driver must time the port resume itself, and turn off resume
> signaling after a period of time. It does that by keeping track of the
> time to turn off resume signaling in an array called resume_done. It
> relies on the GetPortStatus call by the USB core to check if needs to
> clear the resume bit in the port status and clear the time in resume_done.
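
For what it's worth, the bookkeeping described in the quoted paragraph
above can be modelled roughly as follows. This is only a sketch under
my own assumptions: struct model_hub, the model_* functions, NUM_PORTS
and the 20 ms RESUME_MS value are hypothetical names made up for
illustration, not the actual xhci-hcd code, and plain integer
timestamps stand in for jiffies.

```c
#include <errno.h>

#define NUM_PORTS 2   /* hypothetical root hub with two ports */
#define RESUME_MS 20  /* assumed duration of resume signalling, in ms */

/* Hypothetical stand-in for the driver's per-port bookkeeping. */
struct model_hub {
	unsigned long resume_done[NUM_PORTS]; /* 0 = no resume being timed */
	int resuming[NUM_PORTS];              /* 1 while resume is signalled */
};

/* A wakeup arrives: note when resume signalling may be turned off. */
static void model_port_wakeup(struct model_hub *h, int port,
			      unsigned long now)
{
	h->resuming[port] = 1;
	h->resume_done[port] = now + RESUME_MS;
}

/*
 * Get-Port-Status: once the deadline has passed, end the resume and
 * forget the deadline.  Returns 1 while the port is still resuming.
 */
static int model_get_port_status(struct model_hub *h, int port,
				 unsigned long now)
{
	if (h->resume_done[port] && now >= h->resume_done[port]) {
		h->resuming[port] = 0;
		h->resume_done[port] = 0;
	}
	return h->resuming[port];
}

/* Bus suspend refuses to proceed while any deadline is still recorded. */
static int model_bus_suspend(const struct model_hub *h)
{
	int i;

	for (i = 0; i < NUM_PORTS; i++)
		if (h->resume_done[i])
			return -EBUSY;
	return 0;
}
```

In this model a bus suspend attempted while a deadline is still
recorded fails with -EBUSY (16 on Linux, the same value as the
"err -16" in the log). It succeeds only after a Get-Port-Status call
made at or after the deadline has cleared the entry.
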
> When the USB core sees an error on the GetStatus control transfer to the
> HS hub, it starts a reset-resume, which will immediately set the reset bit
> in the roothub port status registers, without issuing a GetPortStatus.

The sequence of events does not occur as you describe.

> This causes the time in resume_done to linger, despite the fact that the
> device has been reset and is no longer resuming. This will cause the xHCI
> bus suspend functions to not allow the roothub to suspend, since a device
> is resuming:
>
> Jan 7 10:28:39 broadway kernel: [ 965.325547] hub 1-1:1.0: state 7 ports 4 chg 0000 evt 0000
> Jan 7 10:28:41 broadway kernel: [ 967.824026] hub 1-1:1.0: hub_suspend
> Jan 7 10:28:41 broadway kernel: [ 967.840013] usb 1-1: usb auto-suspend
> Jan 7 10:28:43 broadway kernel: [ 969.202878] Port Status Change Event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.202881] port resume event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.202884] resume HS port 3

Looks like xhci-hcd is a little confused regarding port numbering.
Here it talks about port 3, and two lines below it says the actual port
number is 0, whereas the event bit in the line below (and the "usb 1-1"
device name) indicates that this is really happening on port 1. But
this doesn't seem to be related to your problem.

> Jan 7 10:28:43 broadway kernel: [ 969.202899] hub 1-0:1.0: state 7 ports 2 chg 0000 evt 0002
> Jan 7 10:28:43 broadway kernel: [ 969.202904] get port status, actual port 0 status = 0x400fe3
> Jan 7 10:28:43 broadway kernel: [ 969.216014] usb 1-1: usb wakeup-resume

This is usbcore's response to the wakeup request; it calls
usb_port_resume() and that routine starts by issuing the following
Get-Port-Status request:

> Jan 7 10:28:43 broadway kernel: [ 969.216019] get port status, actual port 0 status = 0xfe3

And this output is from the Get-Port-Status call which you claim was
not performed.
We can deduce from the log that the USB_PORT_STAT_SUSPEND status bit
isn't set, indicating the port is no longer suspended, i.e., the
hardware part of the resume is complete. Thus it is now okay to send a
Get-Device-Status request.

> Jan 7 10:28:43 broadway kernel: [ 969.216022] usb 1-1: finish resume
> Jan 7 10:28:43 broadway kernel: [ 969.216380] xhci_hcd 0000:01:00.0: WARN: transfer error on endpoint
> Jan 7 10:28:43 broadway kernel: [ 969.216389] usb 1-1: retry with reset-resume

Get-Device-Status gets a transfer error, causing usbcore to do a
reset-resume instead of a normal resume. However the "resume" part
doesn't need to be carried out again, since it is already finished.
Only the reset part still has to be done.

> Jan 7 10:28:43 broadway kernel: [ 969.266718] Port Status Change Event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.272012] get port status, actual port 0 status = 0x200e03
> Jan 7 10:28:43 broadway kernel: [ 969.328299] usb 1-1: reset high speed USB device using xhci_hcd and address 4
> ...
> Jan 7 10:28:47 broadway kernel: [ 973.844013] hub 1-1:1.0: hub_suspend
> Jan 7 10:28:47 broadway kernel: [ 973.860015] usb 1-1: usb auto-suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876023] hub 1-0:1.0: hub_suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876030] usb usb1: bus auto-suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876033] suspend failed because port 1 is resuming
> Jan 7 10:28:49 broadway kernel: [ 975.876035] usb usb1: bus suspend fail, err -16
> Jan 7 10:28:49 broadway kernel: [ 975.876037] hub 1-0:1.0: hub_resume
>
> The fix is to unconditionally clear the time in resume_done (and the other
> associated bus state) if the USB core wants to set the port reset bit, and
> the high speed device is still suspended. USB 3.0 devices do not suffer
> from this issue, since their resume signaling is cleared automatically,
> and the xHCI driver does not have to time the resume.
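
To recap the order of events I read from the log, here is a rough
model of the decision usbcore makes during a port resume. It is a
sketch under stated assumptions, not the real usb_port_resume():
model_port_resume(), enum resume_outcome and the get_device_status_err
parameter are hypothetical, invented for illustration. Only the
USB_PORT_STAT_SUSPEND value (0x0004) is taken from the kernel's ch11.h.

```c
#include <stdint.h>

#define USB_PORT_STAT_SUSPEND 0x0004 /* wPortStatus bit, as in ch11.h */

/* Hypothetical outcomes of the modelled resume attempt. */
enum resume_outcome {
	STILL_SUSPENDED, /* suspend bit set: the port resume isn't done yet */
	NORMAL_RESUME,   /* Get-Device-Status succeeded */
	RESET_ONLY       /* transfer error: port is resumed, reset remains */
};

/*
 * wPortStatus is the Get-Port-Status result, which is examined first.
 * get_device_status_err is the result the following Get-Device-Status
 * transfer would give (0 on success, negative on error); it matters
 * only if the port is no longer suspended.
 */
static enum resume_outcome model_port_resume(uint16_t wPortStatus,
					     int get_device_status_err)
{
	if (wPortStatus & USB_PORT_STAT_SUSPEND)
		return STILL_SUSPENDED;
	if (get_device_status_err == 0)
		return NORMAL_RESUME;
	return RESET_ONLY;
}
```

The case in the log corresponds to the last branch: the suspend bit is
clear when Get-Port-Status returns, Get-Device-Status then fails, and
what remains to be done is the reset.
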

At the time the reset began (and even before that, at the time the
Get-Device-Status request was sent), resume_done should already have
been cleared because the port was already fully resumed. Something
else must be wrong.

Alan Stern