On Tue, Jun 11, 2019 at 10:39 AM Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx> wrote:
>
> Hi Rafael,
>
> at 19:02, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>
> > On Friday, May 17, 2019 11:08:50 AM CEST Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>
> >> If a PCI driver leaves the device handled by it in D0 and calls
> >> pci_save_state() on the device in its ->suspend() or ->suspend_late()
> >> callback, it can expect the device to stay in D0 over the whole
> >> s2idle cycle. However, that may not be the case if there is a
> >> spurious wakeup while the system is suspended, because in that case
> >> pci_pm_suspend_noirq() will run again after pci_pm_resume_noirq()
> >> which calls pci_restore_state(), via pci_pm_default_resume_early(),
> >> so state_saved is cleared and the second iteration of
> >> pci_pm_suspend_noirq() will invoke pci_prepare_to_sleep() which
> >> may change the power state of the device.
> >>
> >> To avoid that, add a new internal flag, skip_bus_pm, that will be set
> >> by pci_pm_suspend_noirq() when it runs for the first time during the
> >> given system suspend-resume cycle if the state of the device has
> >> been saved already and the device is still in D0. Setting that flag
> >> will cause the next iterations of pci_pm_suspend_noirq() to set
> >> state_saved for pci_pm_resume_noirq(), so that it always restores the
> >> device state from the originally saved data, and avoid calling
> >> pci_prepare_to_sleep() for the device.
> >>
> >> Fixes: 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups from
> >> suspend-to-idle")
> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> I just found out this patch has a chance to freeze or reboot the system
> during suspend cycles.
> What information do you need to debug?

It would be good to narrow down the failure to a particular transition, for example.
In particular, if that happens during the dpm_noirq_resume_devices() called from s2idle_loop(), it may be necessary to also skip pci_pm_default_resume_early() for the devices with skip_bus_pm set. How many devices on the affected system end up with skip_bus_pm set, for that matter?
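
For reference while narrowing this down, the sequence described in the changelog can be modeled in a few lines of userspace C. This is only a rough sketch and not the kernel code: struct toy_dev, the toy_*() helpers and the use_flag switch are all made up for illustration, D3hot merely stands in for whatever pci_prepare_to_sleep() would pick, and resetting skip_bus_pm at the start of the cycle is an assumption.

```c
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not struct pci_dev or the
 * real PCI bus-level PM callbacks. */
enum toy_power_state { TOY_D0, TOY_D3HOT };

struct toy_dev {
	enum toy_power_state state;
	bool state_saved;	/* the driver has called pci_save_state() */
	bool skip_bus_pm;	/* the flag added by the patch */
};

/* Start of one system suspend-resume cycle (assumed reset point). */
static void toy_suspend_begin(struct toy_dev *d)
{
	d->skip_bus_pm = false;
}

/* Models pci_pm_suspend_noirq(); use_flag selects patched behavior. */
static void toy_suspend_noirq(struct toy_dev *d, bool use_flag)
{
	if (use_flag && d->skip_bus_pm) {
		/* Later iteration: re-arm state_saved so that resume
		 * restores the originally saved data, and do not
		 * touch the power state. */
		d->state_saved = true;
		return;
	}

	if (d->state_saved) {
		/* The driver saved the state itself; remember that the
		 * device was left in D0 on the first pass. */
		if (use_flag && d->state == TOY_D0)
			d->skip_bus_pm = true;
		return;
	}

	/* Bus-level PM takes over: stands in for pci_save_state()
	 * followed by pci_prepare_to_sleep(). */
	d->state_saved = true;
	d->state = TOY_D3HOT;
}

/* Models pci_pm_resume_noirq() via pci_pm_default_resume_early():
 * the device is back in D0 and restoring clears state_saved. */
static void toy_resume_noirq(struct toy_dev *d)
{
	d->state = TOY_D0;
	d->state_saved = false;
}
```

Running suspend_noirq, resume_noirq, suspend_noirq on a device that starts in D0 with state_saved set mimics one spurious wakeup. In this model the device ends up in D3hot with use_flag false and stays in D0 with use_flag true, which is the difference the patch is meant to make.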