在 2024-09-12星期四的 11:00 -0400,Alan Stern写道: > On Thu, Sep 12, 2024 at 11:21:26AM +0800, duanchenghao wrote: > > 在 2024-09-11星期三的 10:40 -0400,Alan Stern写道: > > > On Tue, Sep 10, 2024 at 05:36:56PM +0800, duanchenghao wrote: > > > > S4 wakeup restores the image that was saved before the system > > > > entered > > > > the S4 sleep state. > > > > > > > > S4 waking up from hibernation > > > > ============================= > > > > kernel initialization > > > > | > > > > v > > > > freeze user task and kernel thread > > > > | > > > > v > > > > load saved image > > > > | > > > > v > > > > freeze the peripheral device and controller > > > > (Check the HCD_FLAG_WAKEUP_ PENDING flag of the USB. If it > > > > is > > > > set, > > > > return to EBUSY and do not perform the following restore > > > > image.) > > > > > > Why is the flag set at this point? It should not be; the device > > > and > > > controller should have been frozen with wakeup disabled. > > > > > This is check point, not set point. > > Yes, I know that. But when the flag was checked, why did the code > find > that it was set? The flag should have been clear. Yes, the current issue is that during S4 testing, there is a probabilistic scenario where clear_bit is not called after set_bit, or clear_bit is called but does not execute after set_bit. Please refer to the two modification points in the v2 patch for details, as both of them can cause the current issue. > > > > Is your problem related to the one discussed in this email > > > thread? > > > > > > https://lore.kernel.org/linux-usb/d8600868-6e4b-45ab-b328-852b6ac8ecb5@xxxxxxxxxxxxxxxxxxx/ > > > > > > Would the suggestion I made there -- i.e., have the xhci-hcd > > > interrupt handler skip calling usb_hcd_resume_root_hub() if the > > > root > > > hub > > > was suspended with wakeup = 0 -- fix your problem? > > > > Skipping usb_hcd_resume_root_hub() should generally be possible, > > but > > it's important to ensure that normal remote wakeup functionality is > > not > > compromised. Is it HUB_SUSPEND that the hub you are referring to is > > in > > a suspended state? > > I don't understand this question. hub_quiesce() gets called with > HUB_SUSPEND when the hub enters a suspended state. > > You are correct about the need for normal remote wakeup to work > properly. The interrupt handler should skip calling > usb_hcd_resume_root_hub() for port connect or disconnect changes and > for > port overcurrent changes (when the root hub is suspended with wakeup > = > 0). But it should _not_ skip calling usb_hcd_resume_root_hub() for > port > resume events. The current issue arises when rh_state is detected as RH_SUSPEND and usb_hcd_resume_root_hub() is called to resume the root hub. However, there is no mutual exclusion between the suspend flag, set_bit, and clear_bit, which can lead to two scenarios: 1. After set_bit is called, the state of the USB device is modified by another process to !USB_STATE_SUSPEND, preventing the hub's resume from being executed, and consequently, clear_bit is not called again. 2. In another scenario, during the hub resume process, after HCD_FLAG_WAKEUP_PENDING is cleared by clear_bit, rh_state has not yet been set to !RH_SUSPENDED. At this point, set_bit is executed, but since the hub has already entered the running state, the clear_bit associated with the resume operation is not executed. Please review the v2 patch, where I have described both the logical flow before the modification and the revised logical flow after the modification. > > Alan Stern