On Mon, 2019-01-28 at 14:39 +0200, Mathias Nyman wrote:
> On 26.01.2019 15:57, Greg KH wrote:
> > On Sat, Jan 26, 2019 at 02:56:24PM +0100, Greg KH wrote:
> > > On Fri, Jan 25, 2019 at 03:02:11PM -0800, Todd Brandt wrote:
> > > > Hi Greg, we run weekly 48-hour S3/S2idle stress tests on each
> > > > release candidate, and I found a single instance (out of 2794
> > > > runs) of this warning on the Lenovo Yoga 920 with linux 5.0.0-rc2
> > > > running S3 suspend/resume.
> > > >
> > > > I noticed a similar issue in bugzilla and you commented that
> > > > these issues should be sent out over the mailing list instead,
> > > > thus I'm sending this via mail. I've attached the full sleepgraph
> > > > timeline to the mail, but it might not make it through the
> > > > mailing list.
> > > >
> > > > Full sleepgraph timeline is here (dmesg snippet below):
> > > > https://01org.github.io/pm-graph/suspend-190120-174105/otcpl-yoga-920-kblr_mem.html
> > > >
> > > Heikki is the best one to help out with this...
> >
> > Ugh, I meant to say "Mathias", sorry about that.
> >
>
> The USB xhci controller in Alpine Ridge will be PCI hotplug removed
> from the PCI bus if no usb device is connected to the type-c port.
>
> This is the case for you when you type lsusb, you only see the PCH
> xhci at xhci_hcd 0000:00:14.0 (bus1 and bus2), not the Alpine Ridge
> xhci_hcd 0000:38:00.0 (bus3 bus4)
>
> In the failing case it looks like the Alpine Ridge xhci controller is
> visible at suspend, and it suspends fine, but at resume it first fails
> to enter D0 state, (might be removed already here) and then the
> controller is hotplug removed.
>
> Suspending:
>
> [ 28.001067] xhci_hcd 0000:38:00.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1777, parent: 0000:02:02.0
> ...
> [ 28.227968] xhci_hcd 0000:38:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 221570 usecs
> [ 28.227996] pcieport 0000:02:02.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 7, parent: 0000:01:00.0
> [ 28.247837] pcieport 0000:02:02.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19365 usecs
> [ 28.247863] pcieport 0000:01:00.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1778, parent: 0000:00:1c.0
> [ 28.267870] pcieport 0000:01:00.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 19528 usecs
> [ 28.267890] pcieport 0000:00:1c.0: calling pci_pm_suspend_noirq+0x0/0x210 @ 1784, parent: pci0000:00
> [ 28.267962] pcieport 0000:00:1c.0: pci_pm_suspend_noirq+0x0/0x210 returned 0 after 62 usecs
> [ 28.267993] PM: noirq suspend of devices complete after 267.095 msecs
> [ 28.268987] ACPI: Preparing to enter system sleep state S3
>
> Resuming: Fails to get xhci back to D0
>
> [ 28.367595] xhci_hcd 0000:38:00.0: calling pci_pm_resume_noirq+0x0/0xf0 @ 1827, parent: 0000:02:02.0
> [ 28.383764] thunderbolt 0000:03:00.0: Refused to change power state, currently in D3
> [ 28.445767] thunderbolt 0000:03:00.0: pci_pm_resume_noirq+0x0/0xf0 returned 0 after 76366 usecs
> [ 28.591777] xhci_hcd 0000:38:00.0: Refused to change power state, currently in D3
> [ 28.653669] xhci_hcd 0000:38:00.0: pci_pm_resume_noirq+0x0/0xf0 returned 0 after 279366 usecs
>
> Alpine Ridge xhci is then PCI hotplug removed,
> state 4 means last stored software state for host is suspended.
>
> [ 29.873614] xhci_hcd 0000:38:00.0: remove, state 4
> [ 29.873619] usb usb4: USB disconnect, device number 1
> [ 29.873749] xhci_hcd 0000:38:00.0: USB bus 4 deregistered
> [ 29.873752] xhci_hcd 0000:38:00.0: remove, state 4
> [ 29.873755] usb usb3: USB disconnect, device number 1
> [ 29.873996] xhci_hcd 0000:38:00.0: Host halt failed, -19
> [ 29.874006] xhci_hcd 0000:38:00.0: Host not accessible, reset failed.
> [ 29.874382] xhci_hcd 0000:38:00.0: USB bus 3 deregistered
> [ 29.874417] ------------[ cut here ]------------
> [ 29.874420] xhci_hcd 0000:38:00.0: disabling already-disabled device
> [ 29.874528] WARNING: CPU: 0 PID: 169 at drivers/pci/pci.c:1870 pci_disable_device+0x9c/0xc0
>
>
> The warning comes from the fact that pci_disable_device() is called
> when a PCI xhci controller is suspended in hcd-pci.c suspend_common(),
> and then again at remove in usb_hcd_pci_remove().
>
> Not sure if the issue here is that PCI layer fails to resume xhci to
> D0, "enabling the device" or USB core should check if the pci device
> exists before disabling it. Or some generic issue about PCI devices
> being hotplug removed while in S3.
>
> -Mathias

Thanks for the info, we'll keep an eye on it. It only happened just this once.
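
For anyone following the thread, here is a rough sketch of the double-disable sequence Mathias describes. It is a toy userspace model, not kernel code. The class PciDevModel and the function suspend_resume_remove are hypothetical names invented for illustration. The sketch also assumes that the failed resume never re-enabled the device, which the thread itself leaves open.

```python
class PciDevModel:
    """Toy model of a PCI device's enable counter (hypothetical, not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.enable_cnt = 0
        self.warnings = []

    def enable(self):
        # Stands in for an enable call made at probe or resume time.
        self.enable_cnt += 1

    def disable(self):
        # Models the check behind "disabling already-disabled device":
        # warn if the device is not currently counted as enabled.
        if self.enable_cnt <= 0:
            self.warnings.append(f"{self.name}: disabling already-disabled device")
        self.enable_cnt -= 1


def suspend_resume_remove(resume_reenables):
    """Replay probe, suspend, (optional) resume and remove on the model.

    resume_reenables=False is the ASSUMED failing case: the device never
    gets back to D0, so nothing re-enables it before it is removed.
    """
    dev = PciDevModel("xhci_hcd 0000:38:00.0")
    dev.enable()        # driver probe enables the device
    dev.disable()       # suspend path disables it
    if resume_reenables:
        dev.enable()    # a successful resume would enable it again
    dev.disable()       # remove path disables it once more
    return dev.warnings


if __name__ == "__main__":
    print(suspend_resume_remove(resume_reenables=True))
    print(suspend_resume_remove(resume_reenables=False))
```

In this model the warning only appears when the second disable is not preceded by a matching enable, which matches the ordering seen in the dmesg excerpt (suspend, failed D0 transition, then remove). It does not say which layer ought to prevent it.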