On 26.01.2019 15:57, Greg KH wrote:
On Sat, Jan 26, 2019 at 02:56:24PM +0100, Greg KH wrote:
On Fri, Jan 25, 2019 at 03:02:11PM -0800, Todd Brandt wrote:
Hi Greg, we run weekly 48-hour S3/S2idle stress tests on each release
candidate, and I found a single instance (out of 2794 runs) of this
warning on the Lenovo Yoga 920 with linux 5.0.0-rc2 running S3
suspend/resume.
I noticed a similar issue in bugzilla and you commented that these
issues should be sent out over the mailing list instead, thus I'm
sending this via mail. I've attached the full sleepgraph timeline to
the mail, but it might not make it through the mailing list.
Full sleepgraph timeline is here (dmesg snippet below):
https://01org.github.io/pm-graph/suspend-190120-174105/otcpl-yoga-920-kblr_mem.html
Heikki is the best one to help out with this...
Ugh, I meant to say "Mathias", sorry about that.
The USB xhci controller in Alpine Ridge is PCI hotplug removed from the
PCI bus when no USB device is connected to the Type-C port.
That is the case on your system: when you run lsusb you only see the PCH xhci at
xhci_hcd 0000:00:14.0 (bus 1 and bus 2), not the Alpine Ridge xhci_hcd 0000:38:00.0
(bus 3 and bus 4).
In the failing case it looks like the Alpine Ridge xhci controller is visible
at suspend and suspends fine, but at resume it first fails to enter D0
(it might already be removed at this point), and then the controller is hotplug removed.
Suspending:
[ 28.001067] xhci_hcd 0000:38:00.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1777, parent: 0000:02:02.0
...
[ 28.227968] xhci_hcd 0000:38:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 221570 usecs
[ 28.227996] pcieport 0000:02:02.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 7, parent: 0000:01:00.0
[ 28.247837] pcieport 0000:02:02.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19365 usecs
[ 28.247863] pcieport 0000:01:00.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1778, parent: 0000:00:1c.0
[ 28.267870] pcieport 0000:01:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19528 usecs
[ 28.267890] pcieport 0000:00:1c.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1784, parent: pci0000:00
[ 28.267962] pcieport 0000:00:1c.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 62 usecs
[ 28.267993] PM: noirq suspend of devices complete after 267.095 msecs
[ 28.268987] ACPI: Preparing to enter system sleep state S3
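The per-device timings in the suspend log above can be pulled out mechanically. The following is a minimal Python sketch, assuming the "returned N after M usecs" line format shown above; the function name callback_durations and the SAMPLE excerpt are made up for illustration and are not part of sleepgraph.

```python
import re

# Matches the "returned" lines printed for each device PM callback.
RETURNED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+(?P<dev>\S+):\s+"
    r"(?P<callback>\S+)\s+returned\s+(?P<ret>-?\d+)\s+after\s+(?P<usecs>\d+)\s+usecs"
)


def callback_durations(log_text):
    """Return a list of (device, callback, return code, usecs) tuples."""
    results = []
    for line in log_text.splitlines():
        m = RETURNED_RE.search(line)
        if m:
            results.append(
                (m["dev"], m["callback"], int(m["ret"]), int(m["usecs"]))
            )
    return results


# Excerpt copied from the suspend log above.
SAMPLE = """\
[ 28.227968] xhci_hcd 0000:38:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 221570 usecs
[ 28.247837] pcieport 0000:02:02.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19365 usecs
[ 28.267870] pcieport 0000:01:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19528 usecs
[ 28.267962] pcieport 0000:00:1c.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 62 usecs
"""

if __name__ == "__main__":
    # Print the slowest callbacks first.
    rows = sorted(callback_durations(SAMPLE), key=lambda r: -r[3])
    for dev, callback, ret, usecs in rows:
        print(f"{dev} {callback} ret={ret} {usecs / 1000:.1f} ms")
```

Sorting by duration makes it easy to see that the xhci_hcd 0000:38:00.0 callback accounts for most of the noirq suspend time in this excerpt.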
Resuming: Fails to get xhci back to D0
[ 28.367595] xhci_hcd 0000:38:00.0: calling pci_pm_resume_noirq+0x0/0xf0 @ 1827, parent: 0000:02:02.0
[ 28.383764] thunderbolt 0000:03:00.0: Refused to change power state, currently in D3
[ 28.445767] thunderbolt 0000:03:00.0: pci_pm_resume_noirq+0x0/0xf0 returned 0 after 76366 usecs
[ 28.591777] xhci_hcd 0000:38:00.0: Refused to change power state, currently in D3
[ 28.653669] xhci_hcd 0000:38:00.0: pci_pm_resume_noirq+0x0/0xf0 returned 0 after 279366 usecs
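One possible reading of "currently in D3" in the lines above, sketched in Python: the power state is taken from the low two bits of the PMCSR register (PCI_PM_CTRL_STATE_MASK, 0x0003), and a config read from a device that no longer responds typically returns all ones, which decodes as D3. This is only an illustration of the "might be removed already" idea; the helper names are hypothetical, and the log alone does not tell whether the device was actually gone at this point.

```python
# Low two bits of the PMCSR register hold the current power state.
PCI_PM_CTRL_STATE_MASK = 0x0003


def decoded_power_state(pmcsr):
    """Return the D-state name encoded in a 16-bit PMCSR value."""
    return "D%d" % (pmcsr & PCI_PM_CTRL_STATE_MASK)


def looks_inaccessible(pmcsr):
    """All ones is what a config read typically yields when nothing responds."""
    return pmcsr == 0xFFFF


if __name__ == "__main__":
    for value in (0x0000, 0x0003, 0xFFFF):
        print(
            f"PMCSR={value:#06x} -> {decoded_power_state(value)}"
            f" inaccessible={looks_inaccessible(value)}"
        )
```

Under this assumption, a device that is genuinely stuck in D3 and a device that has already disappeared from the bus would both report "currently in D3".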
The Alpine Ridge xhci is then PCI hotplug removed.
"state 4" means the last software state stored for the host is suspended:
[ 29.873614] xhci_hcd 0000:38:00.0: remove, state 4
[ 29.873619] usb usb4: USB disconnect, device number 1
[ 29.873749] xhci_hcd 0000:38:00.0: USB bus 4 deregistered
[ 29.873752] xhci_hcd 0000:38:00.0: remove, state 4
[ 29.873755] usb usb3: USB disconnect, device number 1
[ 29.873996] xhci_hcd 0000:38:00.0: Host halt failed, -19
[ 29.874006] xhci_hcd 0000:38:00.0: Host not accessible, reset failed.
[ 29.874382] xhci_hcd 0000:38:00.0: USB bus 3 deregistered
[ 29.874417] ------------[ cut here ]------------
[ 29.874420] xhci_hcd 0000:38:00.0: disabling already-disabled device
[ 29.874528] WARNING: CPU: 0 PID: 169 at drivers/pci/pci.c:1870 pci_disable_device+0x9c/0xc0
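To decode the number in the "remove, state 4" lines above, here is a small Python sketch using the HC_STATE_* values from include/linux/usb/hcd.h as I recall them; treat the table as an assumption and check it against the kernel tree you are running. The helper hc_state_name is made up for this example. The value is parsed as hex, since the message prints it that way.

```python
# Flag bits and states as defined in include/linux/usb/hcd.h
# (values reproduced from memory, verify against your tree).
ACTIVE = 0x01
SUSPEND = 0x04
TRANSIENT = 0x80

HC_STATES = {
    0x00: "HC_STATE_HALT",
    ACTIVE: "HC_STATE_RUNNING",
    SUSPEND | TRANSIENT | ACTIVE: "HC_STATE_QUIESCING",
    SUSPEND | TRANSIENT: "HC_STATE_RESUMING",
    SUSPEND: "HC_STATE_SUSPENDED",
}


def hc_state_name(text):
    """Map the hex number from a 'remove, state N' line to a state name."""
    return HC_STATES.get(int(text, 16), "unknown")


if __name__ == "__main__":
    line = "xhci_hcd 0000:38:00.0: remove, state 4"
    print(hc_state_name(line.rsplit(" ", 1)[1]))
```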
The warning is triggered because pci_disable_device() is called twice: once when the PCI xhci
controller is suspended, in hcd-pci.c suspend_common(), and again at remove, in usb_hcd_pci_remove().
I'm not sure whether the issue is that the PCI layer fails to resume xhci to D0 ("enabling the device"),
whether USB core should check that the PCI device still exists before disabling it, or whether
this is a more generic issue with PCI devices being hotplug removed while in S3.
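The double disable can be modelled with a toy enable counter. This Python sketch is not kernel code: FakePciDev and run_sequence are hypothetical names, and the model simply assumes that a normal resume re-enables the device while the failing case skips that step. Which step is actually skipped on the real hardware is exactly the open question.

```python
class FakePciDev:
    """Toy stand-in for a PCI device; it only tracks the enable count."""

    def __init__(self, name):
        self.name = name
        self.enable_cnt = 0
        self.warnings = []

    def enable(self):
        self.enable_cnt += 1

    def disable(self):
        # Warn when asked to disable a device that is not enabled.
        if self.enable_cnt <= 0:
            self.warnings.append(
                f"{self.name}: disabling already-disabled device"
            )
        self.enable_cnt -= 1


def run_sequence(resume_reenables):
    """Replay probe, suspend, optional resume, and remove; return warnings."""
    dev = FakePciDev("xhci_hcd 0000:38:00.0")
    dev.enable()       # probe
    dev.disable()      # suspend
    if resume_reenables:
        dev.enable()   # resume (assumed to be skipped in the failing case)
    dev.disable()      # remove
    return dev.warnings


if __name__ == "__main__":
    print("normal resume: ", run_sequence(True))
    print("skipped resume:", run_sequence(False))
```

In this model the warning only appears when remove runs against a device that was disabled at suspend and never re-enabled, which matches the ordering seen in the log.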
-Mathias