> -----Original Message-----
> From: Sarah Sharp [mailto:sarah.a.sharp@xxxxxxxxxxxxxxx]
> Sent: Saturday, January 15, 2011 6:00 AM
> To: Xu, Andiry
> Cc: linux-usb@xxxxxxxxxxxxxxx; Alan Stern
> Subject: Issue with hub reset-resume under xHCI
>
> Hi Andiry,
>
> I just wanted to give you a heads up about a potential bug I spent some
> time tracking down last week. It only showed up on the branch I've been
> using for the split roothub patches, which is based on patches Greg sent
> in for 2.6.38. Those patches include the update to make the USB core
> use the runtime PM interface to suspend USB devices. I haven't been
> able to reproduce this on a generic 2.6.37 kernel, so I suspect either
> the split roothub patches or the runtime suspend update revealed the
> bug.
>
> Here's the patch I created against my branch. The xHCI suspend/resume
> variables were moved around into a bus_state structure, but I think you
> can get the general gist of it. Can you look it over, and figure out if
> this bug is possible in 2.6.38? I'm still not sure if it was something
> I introduced with the split roothub patches. If it would show up in
> 2.6.38, I'll revise the patch against 2.6.38.
>

I tried, but I have not been able to reproduce this issue yet, so I will
try to work out what I can from the dmesg log...

>
> 8<-------------------------------------------------------------------->8
> From 0e3395514321b57ed185edc3fe75a39189bf738d Mon Sep 17 00:00:00 2001
> From: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>
> Date: Fri, 7 Jan 2011 11:13:00 -0800
> Subject: [PATCH] xhci: Clear internal resume state on reset.
>
> The setup that displays this issue is a HS multi-TT hub plugged into the
> xHCI roothub. Sometimes when I quickly plug in a device into the HS hub
> right after the hub (but not the bus) is suspended, the hub will resume,
> and then there will be a transfer error on the GetStatus for the HS hub.
> This causes the USB core to start a reset-resume, without first checking
> the port status of that port with a GetPortStatus call into the roothub.
>
> The xHCI driver must time the port resume itself, and turn off resume
> signaling after a period of time. It does that by keeping track of the
> time to turn off resume signaling in an array called resume_done. It
> relies on the GetPortStatus call by the USB core to check if it needs to
> clear the resume bit in the port status and clear the time in resume_done.
> When the USB core sees an error on the GetStatus control transfer to the
> HS hub, it starts a reset-resume, which will immediately set the reset bit
> in the roothub port status registers, without issuing a GetPortStatus.
>
> This causes the time in resume_done to linger, despite the fact that the
> device has been reset and is no longer resuming. This will cause the xHCI
> bus suspend functions to not allow the roothub to suspend, since a device
> is resuming:
>
> Jan 7 10:28:39 broadway kernel: [ 965.325547] hub 1-1:1.0: state 7 ports 4 chg 0000 evt 0000
> Jan 7 10:28:41 broadway kernel: [ 967.824026] hub 1-1:1.0: hub_suspend
> Jan 7 10:28:41 broadway kernel: [ 967.840013] usb 1-1: usb auto-suspend
> Jan 7 10:28:43 broadway kernel: [ 969.202878] Port Status Change Event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.202881] port resume event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.202884] resume HS port 3

I assume the port number 3 we get here is read directly from the HW, which
numbers USB3 and USB2 ports together. Since you have split the hub and
moved the resume_done array into bus_state, the port number has to be
translated into the "faked port number" in the driver. Maybe the wrong
resume_done entry is being set here.

(A small suggestion: clarify the port numbers in the print messages.
Sometimes the number is as reported by the HW, sometimes it is 1-based,
and sometimes it is 0-based.
This causes confusion.)

I see that you have a patch "Fix error in handle_port_status() on port
resume" on your branch to fix this port number error. Does that patch help
with this issue?

> Jan 7 10:28:43 broadway kernel: [ 969.202899] hub 1-0:1.0: state 7 ports 2 chg 0000 evt 0002
> Jan 7 10:28:43 broadway kernel: [ 969.202904] get port status, actual port 0 status = 0x400fe3

The port status indicates a Port Link State Change (U3->Resume), and the
resume signal is on.

> Jan 7 10:28:43 broadway kernel: [ 969.216014] usb 1-1: usb wakeup-resume
> Jan 7 10:28:43 broadway kernel: [ 969.216019] get port status, actual port 0 status = 0xfe3

The PLC bit is cleared, but the resume signal is still on.

> Jan 7 10:28:43 broadway kernel: [ 969.216022] usb 1-1: finish resume

At this point the xHCI hub driver should already have cleared
resume_done[wIndex] and written 0 to the PLS field.

> Jan 7 10:28:43 broadway kernel: [ 969.216380] xhci_hcd 0000:01:00.0: WARN: transfer error on endpoint
> Jan 7 10:28:43 broadway kernel: [ 969.216389] usb 1-1: retry with reset-resume
> Jan 7 10:28:43 broadway kernel: [ 969.266718] Port Status Change Event for port 3
> Jan 7 10:28:43 broadway kernel: [ 969.272012] get port status, actual port 0 status = 0x200e03

Here the port status indicates that the port is in the U0 state and Port
Reset Change is set, so the port has been reset.

> Jan 7 10:28:43 broadway kernel: [ 969.328299] usb 1-1: reset high speed USB device using xhci_hcd and address 4
> ...
> Jan 7 10:28:47 broadway kernel: [ 973.844013] hub 1-1:1.0: hub_suspend
> Jan 7 10:28:47 broadway kernel: [ 973.860015] usb 1-1: usb auto-suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876023] hub 1-0:1.0: hub_suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876030] usb usb1: bus auto-suspend
> Jan 7 10:28:49 broadway kernel: [ 975.876033] suspend failed because port 1 is resuming
> Jan 7 10:28:49 broadway kernel: [ 975.876035] usb usb1: bus suspend fail, err -16

resume_done[0] is not zero, so the bus suspend fails. It should have been
cleared by this point.

I think Alan is right that resume_done should already have been cleared to
0 before the port reset happens. Something is wrong, but I don't think it
is related to the port reset.

I think we should monitor when resume_done is set and cleared, make sure it
is set for the right port, and make sure the driver clears the resume
signal in GetPortStatus.

Thanks,
Andiry
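
For anyone following the "actual port 0 status" values in the log above,
here is a minimal sketch of how those raw values can be decoded. It
assumes the PORTSC bit layout from the xHCI specification (PLS in bits
8:5, port speed in bits 13:10, PRC at bit 21, PLC at bit 22). The SK_*
macros and sk_* helpers are hypothetical names made up for this sketch;
they are not the identifiers used in the xhci driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed PORTSC bit layout (per the xHCI spec); names are local to this sketch. */
#define SK_PORT_CONNECT  (1u << 0)   /* CCS: a device is connected */
#define SK_PORT_ENABLED  (1u << 1)   /* PED: port is enabled */
#define SK_PORT_RESET    (1u << 4)   /* PR: reset signaling in progress */
#define SK_PLS_SHIFT     5           /* PLS: 4-bit port link state */
#define SK_PLS_MASK      0xfu
#define SK_SPEED_SHIFT   10          /* 4-bit port speed ID */
#define SK_SPEED_MASK    0xfu
#define SK_PORT_PRC      (1u << 21)  /* Port Reset Change */
#define SK_PORT_PLC      (1u << 22)  /* Port Link State Change */

/* Link state values that matter for this thread. */
#define SK_PLS_U0        0x0u
#define SK_PLS_U3        0x3u
#define SK_PLS_RESUME    0xfu

/* Extract the port link state field. */
static unsigned sk_pls(uint32_t portsc)
{
	return (portsc >> SK_PLS_SHIFT) & SK_PLS_MASK;
}

/* Extract the port speed ID (3 is high speed with the default speed IDs). */
static unsigned sk_speed(uint32_t portsc)
{
	return (portsc >> SK_SPEED_SHIFT) & SK_SPEED_MASK;
}

/* Name the link states discussed above; everything else is "other". */
static const char *sk_pls_name(unsigned pls)
{
	assert(pls <= SK_PLS_MASK);
	switch (pls) {
	case SK_PLS_U0:     return "U0";
	case SK_PLS_U3:     return "U3";
	case SK_PLS_RESUME: return "Resume";
	default:            return "other";
	}
}

/* Print the fields that the comments in this mail refer to. */
static void sk_print_portsc(uint32_t portsc)
{
	printf("0x%06x: PLS=%s speed=%u CCS=%d PED=%d PR=%d PRC=%d PLC=%d\n",
	       (unsigned)portsc, sk_pls_name(sk_pls(portsc)), sk_speed(portsc),
	       !!(portsc & SK_PORT_CONNECT), !!(portsc & SK_PORT_ENABLED),
	       !!(portsc & SK_PORT_RESET), !!(portsc & SK_PORT_PRC),
	       !!(portsc & SK_PORT_PLC));
}
```

Under that assumed layout, 0x400fe3 decodes to PLS=Resume with PLC set,
0xfe3 to PLS=Resume with PLC clear, and 0x200e03 to PLS=U0 with PRC set,
which matches the inline comments on those three log lines.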