On Mon, Aug 21, 2023 at 9:20 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> Wow, this is super interesting.  e8b908146d44 literally just increases
> a timeout; the complete patch is:
>
>    static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev)
>    {
>   -       pci_bridge_wait_for_secondary_bus(pci_dev, "resume", PCI_RESET_WAIT);
>   +       pci_bridge_wait_for_secondary_bus(pci_dev, "resume",
>   +                                         PCIE_RESET_READY_POLL_MS);
>
> Increasing a timeout should never cause a failure like this, so
> there must be something really unexpected going on.

Hello Bjorn,

thanks for a quick response. Your reply helped me discover that the
laptop doesn't really *fail* to resume, it just makes the resume much
*longer*. I just never waited that long. PCI_RESET_WAIT is 1 second,
PCIE_RESET_READY_POLL_MS is 60 seconds. If I wait long enough, the
laptop finally resumes correctly after roughly 70 seconds (before the
patch the resume took roughly 5 seconds). Sorry for not spotting that
earlier!

I also tested this with the current git master tip (commit
f7757129e3de). Without any adjustments, the resume delay is roughly 70
seconds. But if I change PCIE_RESET_READY_POLL_MS from 60 seconds to 2
seconds and recompile it, the resume delay is roughly 6 seconds.

With the latest kernel f7757129e3de, here are some debugging logs:

* dmesg collected after delayed resume (extra 60 seconds):
  https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984636
* system journal after delayed resume:
  https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984637
* lspci -vv before suspend:
  https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984638
* lspci -vv after delayed resume:
  https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984639

> Would you mind
> collecting the output of "sudo lspci -vv" both with and without
> "pcie_aspm=off"?  No need to try suspend/resume to collect these.
>
> Also, what does this race condition look like?  Dock devices are
> visible before suspend, but sometimes none of them are visible *after*
> resume?  We don't re-enumerate on resume, so does this mean they still
> appear in lspci output but they just don't work?

I didn't manage to debug this today. Given the newly discovered
circumstances described above, I wonder whether your request still
applies. If it does, I can provide it tomorrow.

Thanks for looking into this,
Kamil Páral