PCI runtime PM issue on NEC xHCI host

Sarah Sharp <sarah.a.sharp@xxxxxxxxx> · Tue, 1 Nov 2011 07:28:16 -0700

Linus, Keith, and Paul,

Can you try the attached patch and see if it helps your issues with
enabling runtime PM for the NEC USB 3.0 host controller?  You may have
to increase the delay (30us in the patch) for your system.


Hi Rafael,

I mentioned I've been seeing failures when PCI runtime PM is enabled on
the xHCI host controller in my Lenovo x220.  It turns out that several
people have the same NEC chipset in other Lenovo models and are seeing
similar issues.  When runtime PM is enabled for the host and you plug
in a device, we get a steady stream of:

Oct 31 16:11:31 puck kernel: [ 5196.504007] xhci_hcd 0000:0e:00.0: PME# enabled
Oct 31 16:11:40 puck kernel: [ 5205.503185] xhci_hcd 0000:0e:00.0: PME# disabled
Oct 31 16:11:40 puck kernel: [ 5205.503202] xhci_hcd 0000:0e:00.0: setting latency timer to 64

Those three lines are repeated, but the new USB device never enumerates.

I've discovered that if I add a small delay in the xhci_resume function
(which is invoked by the dev_pm_ops resume method), then the host
controller has time to issue an interrupt about the port status change
before the runtime PM core puts it back into PCI suspend.

I'm running Linus' latest tree as of yesterday (commit 839d8810), which
includes your commit 379021d "PCI / PM: Extend PME polling to all PCI
devices".

Without any delays:

Oct 31 16:11:31 puck kernel: [ 5196.484738] xhci_suspend started
Oct 31 16:11:31 puck kernel: [ 5196.503891] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_suspend: 0
Oct 31 16:11:31 puck kernel: [ 5196.504007] xhci_hcd 0000:0e:00.0: PME# enabled
Oct 31 16:11:40 puck kernel: [ 5205.503185] xhci_hcd 0000:0e:00.0: PME# disabled
Oct 31 16:11:40 puck kernel: [ 5205.503202] xhci_hcd 0000:0e:00.0: setting latency timer to 64
Oct 31 16:11:40 puck kernel: [ 5205.503207] xhci_resume started
Oct 31 16:11:40 puck kernel: [ 5205.503247] xhci_resume done
Oct 31 16:11:40 puck kernel: [ 5205.503248] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_resume: 0
Oct 31 16:11:40 puck kernel: [ 5205.503255] xhci_suspend started
Oct 31 16:11:40 puck kernel: [ 5205.509289] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_suspend: 0
Oct 31 16:11:40 puck kernel: [ 5205.509359] xhci_hcd 0000:0e:00.0: PME# enabled

Note how little time passes between xhci_resume being done
(5205.503247) and xhci_suspend being started (5205.503255): only 8us.
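
For anyone comparing their own logs, the gaps can be computed straight
from the printk timestamps.  The following is only a user-space sketch;
the helper names printk_ts_to_us() and gap_us() are made up for this
illustration and are not part of the kernel or the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: convert a printk timestamp such as
 * "[ 5205.503247]" into microseconds.  Returns -1 if the text does
 * not look like a timestamp.
 */
long long printk_ts_to_us(const char *s)
{
	long long sec;
	int usec;

	assert(s != NULL);
	/* %d is decimal, so "061039" parses as 61039, not as octal. */
	if (sscanf(s, " [ %lld.%6d]", &sec, &usec) != 2)
		return -1;
	return sec * 1000000LL + usec;
}

/*
 * Hypothetical helper: microseconds elapsed from timestamp 'from' to
 * timestamp 'to', or -1 if either one fails to parse.
 */
long long gap_us(const char *from, const char *to)
{
	long long a = printk_ts_to_us(from);
	long long b = printk_ts_to_us(to);

	if (a < 0 || b < 0)
		return -1;
	return b - a;
}
```

With the timestamps from the log above, gap_us("[ 5205.503247]",
"[ 5205.503255]") gives 8.  In the log with the delay added,
"xhci_resume done" at 57565.061039 is followed by "xhci_suspend
started" 40us later and by xhci_irq 43us later.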

If I put a delay at the end of xhci_resume (as small as 30us) before
returning success, the interrupt for the port status change has time to
come in:

Nov  1 06:45:19 puck kernel: [57557.041234] xhci_suspend started
Nov  1 06:45:19 puck kernel: [57557.059419] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_suspend: 0
Nov  1 06:45:19 puck kernel: [57557.059513] xhci_hcd 0000:0e:00.0: PME# enabled
Nov  1 06:45:27 puck kernel: [57565.060976] xhci_hcd 0000:0e:00.0: PME# disabled
Nov  1 06:45:27 puck kernel: [57565.060994] xhci_hcd 0000:0e:00.0: setting latency timer to 64
Nov  1 06:45:27 puck kernel: [57565.060999] xhci_resume started
Nov  1 06:45:27 puck kernel: [57565.061039] xhci_resume done
Nov  1 06:45:27 puck kernel: [57565.061071] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_resume: 0
Nov  1 06:45:27 puck kernel: [57565.061079] xhci_suspend started
Nov  1 06:45:27 puck kernel: [57565.061082] xhci_irq
Nov  1 06:45:27 puck kernel: [57565.061085] Port Status Change Event for port 3
Nov  1 06:45:27 puck kernel: [57565.067076] xhci_resume started
Nov  1 06:45:27 puck kernel: [57565.067119] xhci_resume done
Nov  1 06:45:27 puck kernel: [57565.067150] xhci_hcd 0000:0e:00.0: hcd_pci_runtime_suspend: -16
Nov  1 06:45:27 puck kernel: [57565.067158] pci_pm_runtime_suspend(): hcd_pci_runtime_suspend+0x0/0x50 returns -16

The second suspend attempt now fails with -16 (-EBUSY).  Maybe resume
takes too long, and the runtime PM core needs to mark the device as
busy when the runtime_resume function returns?  Reading the
power/autosuspend_delay_ms file returns an I/O error:

root@puck:/sys/bus/pci/devices/0000:0e:00.0# cat power/autosuspend_delay_ms
cat: power/autosuspend_delay_ms: Input/output error
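
For reference, the two error values seen here can be decoded with the
standard errno definitions: -16 in the log is -EBUSY on Linux, and
"Input/output error" is the usual text for EIO.  The sketch below is
user-space C, and pm_errname() and pm_errtext() are hypothetical names
used only for illustration.  As far as I know, the sysfs attribute
returns -EIO when autosuspend has not been enabled for the device, but
treat that as an assumption rather than something verified here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: name a kernel-style return value (zero or a
 * negative errno).  Only the two codes from this report are handled.
 */
const char *pm_errname(int ret)
{
	assert(ret <= 0);

	if (ret == 0)
		return "success";
	if (ret == -EBUSY)
		return "EBUSY";
	if (ret == -EIO)
		return "EIO";
	return "unknown";
}

/*
 * Hypothetical helper: C library description for the same value.
 * The exact wording depends on the C library in use.
 */
const char *pm_errtext(int ret)
{
	assert(ret <= 0);
	return strerror(-ret);
}
```

On Linux, pm_errname(-16) returns "EBUSY", matching the value that
hcd_pci_runtime_suspend reports in the log.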

Any thoughts on this?  I can certainly add the attached patch to the
xHCI driver, but it seems like there is something fundamentally broken
in the runtime PM core if it's suspending devices without giving them a
chance to interrupt.

Sarah Sharp

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 1ff95a0..e238360 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -914,6 +914,9 @@ failed_restart:
 	set_bit(HCD_FLAG_HW_ACCESSIBLE, &xhci->shared_hcd->flags);
 
 	spin_unlock_irq(&xhci->lock);
+
+	if (xhci->quirks & XHCI_NEC_HOST)
+		udelay(30);
 	return 0;
 }
 #endif	/* CONFIG_PM */