On Wed, 2 May 2018, Mathias Nyman wrote: > On 24.04.2018 16:50, Alan Stern wrote: > > On Tue, 24 Apr 2018, Mathias Nyman wrote: > > > >>>>> In this situation, the HCD_WAKEUP_PENDING(hcd) test in > >>>>> hcd-pci.c:suspend_common() should prevent the controller from going > >>>>> back into D3. The WAKEUP_PENDING bit gets set in > >>>>> usb_hcd_resume_root_hub() and it doesn't get cleared until > >>>>> hcd_bus_resume() runs. > >>>>> > >>>> > >>>> I think xhci never calls usb_hcd_resume_root_hub() in xhci_resume() in this > >>>> specific failing case > >>>> > >>>> xhci_resume() has a check: > >>>> /* Resume root hubs only when have pending events. */ > >>>> status = readl(&xhci->op_regs->status); > >>>> if (status & STS_EINT) { > >>>> usb_hcd_resume_root_hub(xhci->shared_hcd); > >>>> usb_hcd_resume_root_hub(hcd); > >>>> } > >>>> > >>>> If the check fails, then WAKEUP_PENDING bit is not set, and runtime PM > >>>> can suspend host controller again. when xhci driver finally gets to handle the interrupt > >>>> the controller may be in D3 already > >>>> > >>>> This should only happen if xhci_resume() is called before xhci driver sees a pending interrupt, > >>>> could be possible as xhci has interrupt moderation enabled. > >>> > >>> Then maybe that test should be removed. Calling > >>> usb_hcd_resume_root_hub() for every wakeup shouldn't be too bad, > >>> because there probably are not very many times when the controller gets > >>> resumed without the root hub also being resumed. > >>> > >> > >> The check was added to fix system suspend issue on a runtime suspended host: > >> > >> commit d6236f6d1d885aa19d1cd7317346fe795227a3cc > >> > >> xhci: Fix runtime suspended xhci from blocking system suspend. > >> > >> The system suspend flow as following: > >> 1, Freeze all user processes and kenrel threads. > >> > >> 2, Try to suspend all devices. > >> > >> 2.1, If pci device is in RPM suspended state, then pci driver will try > >> to resume it to RPM active state in the prepare stage. > >> > >> 2.2, xhci_resume function calls usb_hcd_resume_root_hub to queue two > >> workqueue items to resume usb2&usb3 roothub devices. > >> > >> 2.3, Call suspend callbacks of devices. > >> > >> 2.3.1, All suspend callbacks of all hcd's children, including > >> roothub devices are called. > >> > >> 2.3.2, Finally, hcd_pci_suspend callback is called. > >> > >> Due to workqueue threads were already frozen in step 1, the workqueue > >> items can't be scheduled, and the roothub devices can't be resumed in > >> this flow. The HCD_FLAG_WAKEUP_PENDING flag which is set in > >> usb_hcd_resume_root_hub won't be cleared. Finally, > >> hcd_pci_suspend will return -EBUSY, and system suspend fails. > > > > Hmmm. I don't recall seeing this problem occur with ehci-hcd. But > > then, I haven't tested it very much recently. > > > > We could change to a different work queue, one that doesn't get > > frozen. But there's no guarantee that the work items would run before > > your step 2.3.2. > > > > Maybe we can avoid step 2.1. I think there have been some recent > > changes to the PM code in this area. There may be a flag you can set > > that will prevent the PCI core from resuming the host controller. > > > > Or maybe we can change step 2.3.1, so that the root hub's suspend > > callback will first do a resume if the WAKEUP_PENDING flag is set. > > That might be the most reliable approach. > > > > I'm not sure I understand the last suggestion, could you open up how it > would work? Here's what I had in mind. See if you think this would work. Consider choose_wakeup() in core/driver.c. That subroutine gets called by usb_suspend() when step 2.3.1 wants to suspend a USB device. We could patch it as follows: --- usb-4.x.orig/drivers/usb/core/driver.c +++ usb-4.x/drivers/usb/core/driver.c @@ -1449,11 +1449,21 @@ static void choose_wakeup(struct usb_dev */ w = device_may_wakeup(&udev->dev); - /* If the device is autosuspended with the wrong wakeup setting, + /* + * If the device is autosuspended with the wrong wakeup setting, * autoresume now so the setting can be changed. + * + * Likewise, if the device is an autosuspended root hub and the + * hcd needs to wake it up before the controller can be suspended, + * resume it now to clear the WAKEUP_PENDING flag. */ - if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup) - pm_runtime_resume(&udev->dev); + if (udev->state == USB_STATE_SUSPENDED) { + struct usb_hcd *hcd = bus_to_hcd(udev->bus); + + if (w != udev->do_remote_wakeup || + (!udev->parent && HCD_WAKEUP_PENDING(hcd))) + pm_runtime_resume(&udev->dev); + } udev->do_remote_wakeup = w; } > I started approaching this from another direction, mainly making sure we don't > immediately runtime suspend the host controller after resume. Adding a next_statechange > minimal time between resume and suspend, and checking for pending events in xhci_suspend(). > > I'll have some patches to test for russianneuromancer@xxxxx soon > > These are generic checks that ehci_suspend() also does To tell the truth, I'm not sure how necessary those next_statechange tests really are. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html