On Sat, Apr 02, 2011 at 04:35:17AM -0400, CAI Qian wrote: > Hi Sarah, > > After applied this patch on the top of USB3 hub changes patches in my branch (can also reproduce the original problem), the problem is still existed. > > Freezing user space processes ... (elapsed 0.01 seconds) done. > Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done. > PM: Preallocating image memory... done (allocated 3924380 pages) > PM: Allocated 15697520 kbytes in 2.53 seconds (6204.55 MB/s) > hub 8-1:1.0: suspend error -16 > pm_op(): usb_dev_freeze+0x0/0x20 returns -16 > PM: Device 8-1 failed to freeze: error -16 So I think the 8-1 device is your roothub. There's a couple different places the code could return -EBUSY (which is what -16 is). In xhci-hub.c, there's if (hcd->self.root_hub->do_remote_wakeup) { port_index = max_ports; while (port_index--) { if (bus_state->resume_done[port_index] != 0) { spin_unlock_irqrestore(&xhci->lock, flags); xhci_dbg(xhci, "suspend failed because " "port %d is resuming\n", port_index + 1); return -EBUSY; } } } There's also various places in the USB core on the device, bus, and host controller suspend patch that can return -EBUSY. Can you enable CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING and give me the dmesg output after you run `dmesg -n 8`? That will allow me to see if the bus suspend function in the xHCI driver is failing. If so, that could indicate something is wrong with the resume_done handling with the new split roothub code. I'm wondering if you're running into an issue I had trouble reproducing. I thought there might be an issue with resume_done in the xHCI code, but Andiry Xu and Alan Stern convinced me my reasoning had to be wrong about the cause of the bug, so I never included the bug fix patch in the final patchset. The patch is attached. The patch only fixes the resume_done issue for a high speed hub, but I could either modify the patch to work for USB 3.0 hubs, or perhaps your resume_done issue is with the high speed portion of your USB 3.0 hub. If, after you enable xHCI debugging and try to hibernate the system, you see a message from the xHCI driver about a particular port resuming, then try to apply that patch and see if it allows you suspend. > Restarting tasks ... done. > > Unfortunately, the latest Linus tree had a regression that prevent hibernate/resume from working even without those patches. CIA, what host controller are you running under? What brand of hub are you using? Andiry, you said the simpler patched allowed your system to suspend and hibernate. Which host are you using? Which hub? > It is now failing to resume every time tried after hibernate. > > PM: Starting manual resume from disk > Freezing user space processes ... > EXT4-fs (dm-0): INFO: recovery required on readonly filesystem > EXT4-fs (dm-0): write access will be enabled during recovery > EXT4-fs (dm-0): recovery complete > EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null) > (elapsed 0.18 seconds) done. > Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. > PM: Loading and decompressing image data (301765 pages) ... done > PM: Read 1207060 kbytes in 6.47 seconds (186.56 MB/s) > serial 00:08: disabled > > After applied this patch and the rest of Andiry's larger USB 3.0 hub power management patches on the top of linus tree, there is a conflict with patch [5/5] here. > > http://marc.info/?l=linux-usb&m=130136847804961&w=2 > > --- drivers/usb/core/hub.c > +++ drivers/usb/core/hub.c > @@ -2305,7 +2305,13 @@ > } > > /* see 7.1.7.6 */ > - status = set_port_feature(hub->hdev, port1, USB_PORT_FEAT_SUSPEND); > + if (hub_is_superspeed(hub->hdev)) > + status = set_port_feature(hub->hdev, > + port1 | (USB_SS_PORT_LS_U3 << 3), > + USB_PORT_FEAT_LINK_STATE); > + else > + status = set_port_feature(hub->hdev, port1, > + USB_PORT_FEAT_SUSPEND); > if (status) { > dev_dbg(hub->intfdev, "can't suspend port %d, status %d\n", > port1, status); > > > Do you have suggestion on which git tree/branch is the best to test those? Those patches should apply against the usb-next branch of Greg's tree: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6.git However, he doesn't track Linus' tree very closely, so I don't think that branch is up-to-date right now. If you let me know when the other resume regression is fixed, then I can create apply the patchset on top of Linus' tree and create a branch for you to test. Sarah Sharp
>From 0e3395514321b57ed185edc3fe75a39189bf738d Mon Sep 17 00:00:00 2001 From: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> Date: Fri, 7 Jan 2011 11:13:00 -0800 Subject: [PATCH] xhci: Clear internal resume state on reset. The setup that displays this issue is a HS mult-tt hub plugged into the xHCI roothub. Sometimes when I quickly plug in a device into the HS hub right after the hub (but not the bus) is suspended, the hub will resume, and then there will be a transfer error on the GetStatus for the HS hub. This causes the USB core to start a reset-resume, without first checking the port status of that port with a GetPortStatus call into the roothub. The xHCI driver must time the port resume itself, and turn off resume signaling after a period of time. It does that by keeping track of the time to turn off resume signaling in an array called resume_done. It relies on the GetPortStatus call by the USB core to check if needs to clear the resume bit in the port status and clear the time in resume_done. When the USB core sees an error on the GetStatus control transfer to the HS hub, it starts a reset-resume, which will immediately set the reset bit in the roothub port status registers, without issuing a GetPortStatus. This causes the time in resume_done to linger, despite the fact that the device has been reset and is no longer resuming. This will cause the xHCI bus suspend functions to not allow the roothub to suspend, since a device is resuming: Jan 7 10:28:39 broadway kernel: [ 965.325547] hub 1-1:1.0: state 7 ports 4 chg 0000 evt 0000 Jan 7 10:28:41 broadway kernel: [ 967.824026] hub 1-1:1.0: hub_suspend Jan 7 10:28:41 broadway kernel: [ 967.840013] usb 1-1: usb auto-suspend Jan 7 10:28:43 broadway kernel: [ 969.202878] Port Status Change Event for port 3 Jan 7 10:28:43 broadway kernel: [ 969.202881] port resume event for port 3 Jan 7 10:28:43 broadway kernel: [ 969.202884] resume HS port 3 Jan 7 10:28:43 broadway kernel: [ 969.202899] hub 1-0:1.0: state 7 ports 2 chg 0000 evt 0002 Jan 7 10:28:43 broadway kernel: [ 969.202904] get port status, actual port 0 status = 0x400fe3 Jan 7 10:28:43 broadway kernel: [ 969.216014] usb 1-1: usb wakeup-resume Jan 7 10:28:43 broadway kernel: [ 969.216019] get port status, actual port 0 status = 0xfe3 Jan 7 10:28:43 broadway kernel: [ 969.216022] usb 1-1: finish resume Jan 7 10:28:43 broadway kernel: [ 969.216380] xhci_hcd 0000:01:00.0: WARN: transfer error on endpoint Jan 7 10:28:43 broadway kernel: [ 969.216389] usb 1-1: retry with reset-resume Jan 7 10:28:43 broadway kernel: [ 969.266718] Port Status Change Event for port 3 Jan 7 10:28:43 broadway kernel: [ 969.272012] get port status, actual port 0 status = 0x200e03 Jan 7 10:28:43 broadway kernel: [ 969.328299] usb 1-1: reset high speed USB device using xhci_hcd and address 4 ... Jan 7 10:28:47 broadway kernel: [ 973.844013] hub 1-1:1.0: hub_suspend Jan 7 10:28:47 broadway kernel: [ 973.860015] usb 1-1: usb auto-suspend Jan 7 10:28:49 broadway kernel: [ 975.876023] hub 1-0:1.0: hub_suspend Jan 7 10:28:49 broadway kernel: [ 975.876030] usb usb1: bus auto-suspend Jan 7 10:28:49 broadway kernel: [ 975.876033] suspend failed because port 1 is resuming Jan 7 10:28:49 broadway kernel: [ 975.876035] usb usb1: bus suspend fail, err -16 Jan 7 10:28:49 broadway kernel: [ 975.876037] hub 1-0:1.0: hub_resume The fix is to unconditionally clear the time in resume_done (and the other associated bus state) if the USB core wants to set the port reset bit, and the high speed device is still suspended. USB 3.0 devices do not suffer from this issue, since their resume signaling is cleared automatically, and the xHCI driver does not have to time the resume. Signed-off-by: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> --- drivers/usb/host/xhci-hub.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index dec0b97..6edc84d 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -555,6 +555,21 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 typeReq, u16 wValue, xhci_writel(xhci, temp, port_array[wIndex]); temp = xhci_readl(xhci, port_array[wIndex]); + + /* HS reset-resume after a failed get status */ + if (!DEV_SUPERSPEED(temp) && + bus_state->resume_done[wIndex] != 0) { + bus_state->resume_done[wIndex] = 0; + slot_id = xhci_find_slot_id_by_port(hcd, xhci, + wIndex + 1); + if (!slot_id) { + xhci_dbg(xhci, "slot_id is zero\n"); + goto error; + } + xhci_ring_device(xhci, slot_id); + bus_state->port_c_suspend |= 1 << wIndex; + bus_state->suspended_ports &= ~(1 << wIndex); + } xhci_dbg(xhci, "set port reset, actual port %d status = 0x%x\n", wIndex, temp); break; default: -- 1.7.1