On 14.4.2022 19.30, Evan Green wrote: > Hi Alan and Mathias, > > On Thu, Apr 14, 2022 at 7:21 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: >> >> On Thu, Apr 14, 2022 at 05:00:12PM +0300, Mathias Nyman wrote: >>> On 12.4.2022 18.40, Alan Stern wrote: >>>> On Tue, Apr 12, 2022 at 05:56:42PM +0300, Mathias Nyman wrote: >>>>> On 11.4.2022 17.50, Alan Stern wrote: >>>>>> For example, what would happen if the user unplugs a device right in the >>>>>> middle of the freeze transition, after the root hub has been frozen but >>>>>> before the controller is frozen? We don't want such an unplug event to >>>>>> prevent the system from going into hibernation -- especially if the root >>>>>> hub was not enabled for wakeup. >>>>> >>>>> We should be able to let system go to hibernate even if we get a disconnect >>>>> interrupt between roothub and host controller freeze. >>>>> Host is not yet suspended so no PME# wake is generated, only an interrupt. >>>>> >>>>> From Linux PM point of view it should be ok as well as the actual xhci >>>>> device that is generating the interrupt is hasnt completer freeze() >>>>> >>>>> The xhci interrupt handler just needs to make sure that the disconnect >>>>> isn't propagated if roothub is suspended and wake on disconnect >>>>> is not set. And definitely make sure xhci doesn't start roothub polling. >>>>> >>>>> When freeze() is called for the host we should prevent the host from >>>>> generating interrupts. >>>> >>>> I guess that means adding a new callback. Or we could just suspend the >>>> controller, like Evan proposed originally >>> >>> Suspending the host in freeze should work. >>> It will do an extra xhci controller state save stage, but that should be harmless. >>> >>> But is there really a need for the suggested noirq part? >>> >>> + .freeze_noirq = hcd_pci_suspend_noirq, >>> >>> That will try to set the host to PCI D3 state. >>> It seems a bit unnecessary for freeze. >> >> Agreed. >> >>>>>> (If the root hub _is_ enabled for wakeup then it's questionable. >>>>>> Unplugging a device would be a wakeup event, so you could easily argue >>>>>> that it _should_ prevent the system from going into hibernation. After >>>>>> all, if the unplug happened a few milliseconds later, after the system >>>>>> had fully gone into hibernation, then it would cause the system to wake >>>>>> up.) >>>>>> >>>>>>> Would it make sense prevent xHCI interrupt generation in the host >>>>>>> freeze() stage, clearing the xHCI EINT bit in addition to calling >>>>>>> check_roothub_suspend()? >>>>>>> Then enable it back in thaw() >>>>>> >>>>>> That won't fully eliminate the problem mentioned in the preceding >>>>>> paragraphs, although I guess it would help somewhat. >>>>> >>>>> Would the following steps solve this? >>>>> >>>>> 1. Disable device initiated resume for connected usb devices in freeze() >>>>> >>>>> 2. Don't propagate connect or OC changes if roothub is suspended and port wake >>>>> flags are disabled. I.E don't kick roothub polling in xhci interrupt >>>>> handler here. >>>> >>>> I guess you can't just halt the entire host controller when only one of >>>> the root hubs is suspended with wakeup disabled. That does complicate >>>> things. But you could halt it as soon as both of the root hubs are >>>> frozen. Wouldn't that prevent interrupt generation? >>> >>> True, but probably easier to just suspend host in freeze() as you stated above. >> >> Okay. >> >> Evan, this discussion suggests that you rewrite your patch as a series >> of three: >> >> 1. Change choose_wakeup() so that for PM_EVENT_FREEZE, wakeup is >> always disabled. > > If I understand this correctly, this means potentially runtime > resuming the device so its wakeup setting can be consistently set to > wakeups disabled across a freeze transition. Got it I think in terms > of the "how". > >> >> 2. Change the xhci-hcd interrupt handler so that port-status >> changes are ignored if the port's root hub is suspended with >> wakeup disabled. > > This part confuses me. This would be way deep under > xhci_handle_event(), probably in handle_port_status(), just throwing > away certain events that come in the ring. How would we know to go > back and process those events later? I think we don't need to do this > if we suspend the controller as in #3 below. The suspended (halted) > controller wouldn't generate event interrupts (since the spec mentions > port status change generation is gated on HCHalted). So we're already > covered against receiving interrupts in this zone by halting the > controller, and the events stay nicely pending for when we restart it > in thaw. Was thinking the same here. It would be nice to have this to comply with usb spec, keeping roothub from propagating connect/disconnect events immediately after suspending it with wake flags cleared. But it's a lot of work to implement this, and for this issue, and linux hibernate point of view I don't think it has any real benefit. The actual device generating the interrupt is the host (parent of roothub), and that will stop once freeze() is called for it in #3 > > Is the goal of #1 purely a setup change for #2, or does it stand on > its own even if we nixed #2? Said differently, is #1 trying to ensure > that wake signaling doesn't occur at all between freeze and thaw, even > when the controller is suspended and guaranteed not to generate > interrupts via its "normal" mechanism? I don't have a crisp mental > picture of how the wake signaling works, but if the controller wake > mechanism sidesteps the original problem of sending an MSI to a dead > CPU (as in, it does not use MSIs), then it might be ok as-is. #1 is needed because xHCI can generate wake events even when halted if device initiated resume signaling is detected on a roothub port. Just like it can generate wake events on connect/disconnect if wake flags are set. (xhci spec figure 4-34, see PLS=Resume) We want to avoid those wakeups between freeze-thaw So just #1 and #3 should probably solve this, and be an easier change. -Mathias > >> >> 3. As in the original patch, make the .freeze and .thaw callbacks >> in hcd-pci.c call the appropriate suspend and resume routines, >> but don't do anything for .freeze_noirq and .thaw_noirq. > > Sure. I had made the _noirq paths match suspend for consistency, I > wasn't sure if those could mix n match without issues. I'll try it out > leaving the _noirq callbacks alone. > -Evan > >> >> How does that sound? >> >> Alan Stern