On Wed, Sep 25, 2024 at 10:50:41AM +0800, dengjie wrote: > Reproduction of the problem: During the S4 stress test, when a USB device is inserted or > removed, there is a probability that the S4 wakeup will turn into a reboot.The following > two points describe how to analyze and locate the problem points: > > 1. During the boot stage when S4 is awakened, after the USB RootHub is initialized, > it will enter the runtime suspend state. From then on, whenever an xhci port change > event occurs, it will trigger a remote wakeup request event and add wakeup_work > to pm_wq, where the subsequent RootHub runtime resume process will be handled by pm_wq. > > xhci runtime suspend flow: > S4 boot > |->xhci init > |->register_root_hub > |->hub_probe > |->callback = RPM_GET_CALLBACK(dev, runtime_suspend) /* xhci RootHub runtime suspend */ > > xhci runtime resume flow : > xhci_irq() > |->xhci_handle_event() > |->handle_port_status() > |->if(hcd->state == HC_STATE_SUSPENDED) > |->usb_hcd_resume_root_hub() > |->set_bit(HCD_FLAG_WAKEUP_PENDING, &hcd->flags) /* wakeup pending signal to be set */ > |->queue_work(pm_wq, &hcd->wakeup_work) > |->hcd_resume_work() /* hcd->wakeup_work */ > |->usb_remote_wakeup() > |->callback = RPM_GET_CALLBACK(dev, runtime_resume) > |->usb_runtime_resume() /* usb runtime resume */ > |->generic_resume() > |->hcd_bus_resume() > |->clear_bit(HCD_FLAG_WAKEUP_PENDING, &hcd->flags); > /* wakeup pending signal to be clear */ > > 2. However, during the quiesce phase of S4 wakeup, freeze_kernel_threads() will freeze this pm_wq, > and between freeze_kernel_threads() and dpm_suspend_start(), there exists a very time-consuming > S4 image loading process. This leads to a situation where, if an xhci port change event occurs > after freeze_kernel_threads(), triggering the wakeup pending signal to be set,but it cannot > be processed by pm_wq to clear this wakeup_pending bit, it will result in a subsequent > dpm_suspend_start() where USB suspend_common() detects the wakeup pending signal being > set and returns an -EBUSY error, interrupting the S4 quiesce process and reverting to a reboot. > > S4 wakeup > |->resume_store > |->software_resume() > |->freeze_kernel_threads() /* will freeze pm_wq */ > |->load_image_and_restore() > |->swsusp_read() /* S4 image loading: time-consuming . > When an xhci port change event occurs at this point, it triggers the wakeup pending signal to be set. > However, since the pm_wq is in a frozen state, the wakeup_pending bit cannot be cleared.*/ > |->hibernation_restore > |->dpm_suspend_start(PMSG_QUIESCE) > |->hcd_pci_suspend() > |->suspend_common() > |->if (do_wakeup && HCD_WAKEUP_PENDING(hcd)) return -EBUSY; > > Below is a description of the countermeasures taken to address this issue: > 1. Considering the restore process that occurs after the quiesce phase during S4 wakeup, > which essentially resets all root hubs,checking this wakeup pending status in USB suspend_common() > during the quiesce phase is of little significance and should therefore be filtered out. > > S4 wakeup restore phase > |->dpm_resume(PMSG_RESTORE) > |->hcd_pci_restore() > |->xhci_resume() /* reset all root hubs */ > > Fixes: 3904bdf0821c ("PM: hibernate: Freeze kernel threads in software_resume()") > Signed-off-by: dengjie <dengjie03@xxxxxxxxxx> What happened to the other authorship information? And again, please use your name, not an email alias. thanks, greg k-h