Hi Ilpo,

On Tue, Feb 4, 2025 at 8:48 AM Ilpo Järvinen
<ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 3 Feb 2025, Rafael J. Wysocki wrote:
>
> > On Mon, Feb 3, 2025 at 9:12 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >
> > > Hi,
>
> Hi Rafael,
>
> > > The following commit:
> > >
> > > commit 1db806ec06b7c6e08e8af57088da067963ddf117
> > > Author: Jian-Hong Pan <jhp@xxxxxxxxxxxxx>
> > > Date:   Fri Nov 15 15:22:02 2024 +0800
> > >
> > >     PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()
> > >
> > >     After 17423360a27a ("PCI/ASPM: Save L1 PM Substates Capability for
> > >     suspend/resume"), pci_save_aspm_l1ss_state(dev) saves the L1SS state for
> > >     "dev", and pci_restore_aspm_l1ss_state(dev) restores the state for both
> > >     "dev" and its parent.
> > >
> > >     The problem is that unless pci_save_state() has been used in some other
> > >     path and has already saved the parent L1SS state, we will restore junk to
> > >     the parent, which means the L1 Substates likely won't work correctly.
> > >
> > >     Save the L1SS config for both the device and its parent in
> > >     pci_save_aspm_l1ss_state(). When restoring, we need both because L1SS must
> > >     be enabled at the parent (the Downstream Port) before being enabled at the
> > >     child (the Upstream Port).
> > >
> > >     Link: https://lore.kernel.org/r/20241115072200.37509-3-jhp@xxxxxxxxxxxxx
> > >     Fixes: 17423360a27a ("PCI/ASPM: Save L1 PM Substates Capability for
> > >     suspend/resume")
> > >     Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > >     Suggested-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > >     Signed-off-by: Jian-Hong Pan <jhp@xxxxxxxxxxxxx>
> > >     [bhelgaas: parallel save/restore structure, simplify commit log, patch at
> > >     https://lore.kernel.org/r/20241212230340.GA3267194@bhelgaas]
> > >     Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > >     Tested-by: Jian-Hong Pan <jhp@xxxxxxxxxxxxx> # Asus B1400CEAE
> > >
> > > broke system suspend/resume on my Dell XPS13 9360. It doesn't even
> > > pass suspend/resume testing after "echo devices > /sys/power/pm_test".
> > >
> > > It looks like PCIe links are all down during resume after the above
> > > commit, but it is rather hard to collect any data in that state.
> > >
> > > Reverting the above commit on top of 6.14-rc1 makes things work again,
> > > no problem.
> > >
> > > I'm unsure what exactly the problem is ATM, but I'm going to check a
> > > couple of theories.
> >
> > The attached change makes it work again, FWIW, but moving the
> > parent->l1ss check alone below the pdev l1ss saving doesn't help.
> >
> > So it is either the parent check against NULL or the
> > pcie_downstream_port(pdev) one that breaks it. I guess the former,
> > but I'll test it tomorrow.
>
> Neither of those is the root cause

Well, not quite.

> but it's bit hard to see from the code
> itself because the parent->saved_state check your test patch also removed

My patch hasn't removed that check.

Besides, suspend/resume works on my system without commit
1db806ec06b7c6e0 and the parent->saved_state only affects the parent,
so it clearly cannot be the culprit here.

> looks very logical on a glance (but that's the problematic line).
>
> The fix is already here with the explanation:
>
> https://lore.kernel.org/linux-pci/20250131152913.2507-1-ilpo.jarvinen@xxxxxxxxxxxxxxx/T/#u

So it turns out that the minimum fix that works here is what I posted.

That is, the upfront pcie_downstream_port(pdev) check needs to be
dropped and the !parent check needs to be moved after saving the
pdev's state.

IOW, it looks like on this platform, it is necessary to save the l1ss
state for a Root Port.
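
For illustration only, the ordering described above can be sketched in
user space roughly as follows. This is not the kernel code: struct
mock_dev, save_one() and save_l1ss_state() are hypothetical stand-ins
made up for this example, and "saving" is reduced to setting a flag.
The point is only that the device's own state is saved unconditionally
(no early return for downstream ports), and the parent is looked at
afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel definition. */
struct mock_dev {
	const char *name;
	bool has_l1ss;           /* device exposes an L1SS capability */
	bool l1ss_saved;         /* set once the state has been "saved" */
	struct mock_dev *parent; /* upstream bridge, NULL for a Root Port */
};

/* "Save" the L1SS state of a single device, if it has the capability. */
static void save_one(struct mock_dev *dev)
{
	if (dev->has_l1ss)
		dev->l1ss_saved = true;
}

/*
 * Assumed ordering: no upfront downstream-port check, save the device
 * itself first, and only then bail out if there is no parent.
 */
static void save_l1ss_state(struct mock_dev *dev)
{
	assert(dev != NULL);

	save_one(dev);

	if (!dev->parent)
		return;

	save_one(dev->parent);
}
```

With this ordering, calling save_l1ss_state() on a device without a
parent (the Root Port case) still saves that device's own state, which
is what appears to be needed on this platform.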