[+cc Rafael, linux-pm because they *are* PM experts :)] On Wed, Jan 08, 2025 at 02:15:28AM +0200, Sergey Ryazanov wrote: > On 08.01.2025 01:45, Bjorn Helgaas wrote: > > On Wed, Jan 08, 2025 at 01:13:41AM +0200, Sergey Ryazanov wrote: > > > On 05.01.2025 19:39, Maciej S. Szmigiero wrote: > > > > Currently, the driver is seriously broken with respect to the > > > > hibernation (S4): after image restore the device is back into > > > > IPC_MEM_EXEC_STAGE_BOOT (which AFAIK means bootloader stage) and needs > > > > full re-launch of the rest of its firmware, but the driver restore > > > > handler treats the device as merely sleeping and just sends it a > > > > wake-up command. > > > > > > > > This wake-up command times out but device nodes (/dev/wwan*) remain > > > > accessible. > > > > However attempting to use them causes the bootloader to crash and > > > > enter IPC_MEM_EXEC_STAGE_CD_READY stage (which apparently means "a crash > > > > dump is ready"). > > > > > > > > It seems that the device cannot be re-initialized from this crashed > > > > stage without toggling some reset pin (on my test platform that's > > > > apparently what the device _RST ACPI method does). > > > > > > > > While it would theoretically be possible to rewrite the driver to tear > > > > down the whole MUX / IPC layers on hibernation (so the bootloader does > > > > not crash from improper access) and then re-launch the device on > > > > restore this would require significant refactoring of the driver > > > > (believe me, I've tried), since there are quite a few assumptions > > > > hard-coded in the driver about the device never being partially > > > > de-initialized (like channels other than devlink cannot be closed, > > > > for example). > > > > Probably this would also need some programming guide for this hardware. > > > > > > > > Considering that the driver seems orphaned [1] and other people are > > > > hitting this issue too [2] fix it by simply unbinding the PCI driver > > > > before hibernation and re-binding it after restore, much like > > > > USB_QUIRK_RESET_RESUME does for USB devices that exhibit a similar > > > > problem. > > > > > > > > Tested on XMM7360 in HP EliteBook 855 G7 both with s2idle (which uses > > > > the existing suspend / resume handlers) and S4 (which uses the new code). > > > > > > > > [1]: https://lore.kernel.org/all/c248f0b4-2114-4c61-905f-466a786bdebb@xxxxxxxxxxxxx/ > > > > [2]: > > > > https://github.com/xmm7360/xmm7360-pci/issues/211#issuecomment-1804139413 > > > > > > > > Signed-off-by: Maciej S. Szmigiero <mail@xxxxxxxxxxxxxxxxxxxxx> > > > > > > Generally looks good to me. Lets wait for approval from PCI > > > maintainers to be sure that there no unexpected side effects. > > > > I have nothing useful to contribute here. Seems like kind of a > > mess. But Intel claims to maintain this, so it would be nice if > > they would step up and make this work nicely. > > Suddenly, Intel lost their interest in the modems market and, as > Maciej mentioned, the driver was abandon for a quite time now. The > author no more works for Intel. You will see the bounce. Well, that's unfortunate :) Maybe step 0 is to remove the Intel entry from MAINTAINERS for this driver. > Bjorn, could you suggest how to deal easily with the device that is > incapable to seamlessly recover from hibernation? I am totally > hopeless regarding the PM topic. Or is the deep driver rework the > only option? I'm pretty PM-illiterate myself. Based on https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/pm/sleep-states.rst?id=v6.12#n109, I assume that when we resume after hibernate, devices are in the same state as after a fresh boot, i.e., the state driver .probe() methods see. So I assume that some combination of dev_pm_ops methods must be able to do basically the same as .probe() to get the device usable again after it was completely powered off and back on. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/pm/devices.rst?id=v6.12#n506 mentions .freeze(), .thaw(), .restore(), etc, but the fact that few drivers set those pointers and all the nice macros for setting pm ops (SYSTEM_SLEEP_PM_OPS, NOIRQ_SYSTEM_SLEEP_PM_OPS, etc) only take suspend and resume functions makes me think most drivers must handle hibernation in the same .suspend() and .resume() functions they use for non-hibernate transitions. Since all drivers have to cope with devices needing to be reinitialized after hibernate, I would look around to see how other drivers do it and see if you can do it similarly. Sorry this is still really a non-answer. Bjorn