> From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> Sent: Tuesday, October 8, 2019 12:56 PM
> ...
> Wordsmithing nit: what the patch does is not "fix the error message";
> what it does is fix the *problem*, i.e., the fact that we can't
> operate the device because we can't enable MSI-X. The message is only
> a symptom.

I totally agree. :-)

> IIUC the relevant part of the system hibernation sequence is:
>
>   pci_pm_freeze_noirq
>   pci_pm_thaw_noirq
>   pci_pm_thaw
>
> And the execution flow is:
>
>   pci_pm_freeze_noirq
>     if (pci_has_legacy_pm_support(pci_dev)) # true for mlx4
>       pci_legacy_suspend_late(dev, PMSG_FREEZE)
>         pci_pm_set_unknown_state
>           dev->current_state = PCI_UNKNOWN  # <---
>   pci_pm_thaw_noirq
>     if (pci_has_legacy_pm_support(pci_dev)) # true
>       pci_legacy_resume_early(dev)          # noop; mlx4 doesn't implement
>   pci_pm_thaw                               # returns -95 EOPNOTSUPP
>     if (pci_has_legacy_pm_support(pci_dev)) # true
>       pci_legacy_resume
>         drv->resume
>           mlx4_resume                       # mlx4_driver.resume (legacy)
>             mlx4_load_one
>               mlx4_enable_msi_x
>                 pci_enable_msix_range
>                   __pci_enable_msix_range
>                     __pci_enable_msix
>                       if (!pci_msi_supported())
>                         if (dev->current_state != PCI_D0)  # <---
>                           return 0
>                         return -EINVAL
>               err = -EOPNOTSUPP
>               "INTx is not supported ..."
>
> (These are just my notes; you don't need to put them all into the
> commit message. I'm just sharing them in case I'm not understanding
> correctly.)

Yes, these notes are accurate.

> > > > > When the system starts again, a fresh kernel starts to run, and
> > > > > when the kernel detects that a hibernation image was saved, the
> > > > > kernel "quiesces" the devices, and then "restores" the devices
> > > > > from the saved image. In this path:
> > > > > device_resume_noirq() -> ... ->
> > > > >   pci_pm_restore_noirq() ->
> > > > >     pci_pm_default_resume_early() ->
> > > > >       pci_power_up() moves the device states back to PCI_D0. This
> > > > > path is not broken and doesn't need my patch.
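
For illustration only, the thaw-path flow in the notes above can be reduced
to a small user-space C model. This is a rough sketch, not kernel code: every
model_* name, the enum and the struct are hypothetical stand-ins, and the
only thing it tries to show is why a legacy-PM device is still in the
"unknown" state when the driver later tries to enable MSI-X, and how forcing
D0 before the legacy check changes that.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's power states (illustration only). */
enum model_power_state { MODEL_D0, MODEL_D3HOT, MODEL_UNKNOWN };

/* Hypothetical, heavily reduced view of a PCI device. */
struct model_pci_dev {
	enum model_power_state current_state;
	bool legacy_pm;		/* driver uses legacy suspend/resume callbacks */
};

/* Models the freeze_noirq step: a legacy-PM device ends up in unknown state. */
void model_freeze_noirq(struct model_pci_dev *dev)
{
	if (dev->legacy_pm)
		dev->current_state = MODEL_UNKNOWN;
}

/* Models the old ordering: legacy drivers return before the D0 transition. */
void model_thaw_noirq_old(struct model_pci_dev *dev)
{
	if (dev->legacy_pm)
		return;		/* legacy resume_early is a no-op for mlx4 */

	dev->current_state = MODEL_D0;
}

/* Models the new ordering: force D0 first, then handle the legacy case. */
void model_thaw_noirq_new(struct model_pci_dev *dev)
{
	dev->current_state = MODEL_D0;

	if (dev->legacy_pm)
		return;		/* legacy resume_early would run here */
}

/* Models the MSI-X enable path: it refuses any device that is not in D0. */
int model_enable_msix(const struct model_pci_dev *dev)
{
	if (dev->current_state != MODEL_D0)
		return -EINVAL;

	return 0;
}
```

In this model, calling model_freeze_noirq() and then model_thaw_noirq_old()
on a device with legacy_pm set leaves model_enable_msix() returning -EINVAL,
while the same sequence with model_thaw_noirq_new() returns 0.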
>
> The cc list suggests that this might be a fix for a user-reported
> problem. Is there a launchpad or similar link you could include here?

I guess I'm the first one to notice the issue and there is not any bug
link AFAIK. The hibernation process usually saves the states into a local
disk (before the system is powered off), and the Mellanox NIC is not
needed during the process, so it's not a real issue that the NIC can not
work between pci_pm_thaw() and power_down(). This may explain why nobody
else noticed the issue. I happened to see the error message, and hence
investigated the issue.

> Should this be marked for stable?

I think we should do it.

> > > > > --- a/drivers/pci/pci-driver.c
> > > > > +++ b/drivers/pci/pci-driver.c
> > > > > @@ -1074,15 +1074,16 @@ static int pci_pm_thaw_noirq(struct device *dev)
> > > > >  			return error;
> > > > >  	}
> > > > >
> > > > > -	if (pci_has_legacy_pm_support(pci_dev))
> > > > > -		return pci_legacy_resume_early(dev);
> > > > > -
> > > > >  	/*
> > > > >  	 * pci_restore_state() requires the device to be in D0 (because of MSI
> > > > >  	 * restoration among other things), so force it into D0 in case the
> > > > >  	 * driver's "freeze" callbacks put it into a low-power state directly.
> > > > >  	 */
> > > > >  	pci_set_power_state(pci_dev, PCI_D0);
> > > > > +
> > > > > +	if (pci_has_legacy_pm_support(pci_dev))
> > > > > +		return pci_legacy_resume_early(dev);
> > > > > +
> > > > >  	pci_restore_state(pci_dev);
> > > > >
> > > > >  	if (drv && drv->pm && drv->pm->thaw_noirq)
> > > > > --
> > > > > 2.19.1
> > > > >
> > The patch looks reasonable to me, but the comment above the
> > pci_set_power_state() call needs to be updated too IMO.
>
> Hmm.
>
> 1) pci_restore_state() mainly writes config space, which doesn't
> require the device to be in D0.
> The only thing I see that would
> require D0 is the MSI-X MMIO space, so to be more specific, the
> comment could say "restoring the MSI-X *MMIO* state requires the
> device to be in D0".
>
> But I think you meant some other comment change. Did you mean
> something along the lines of "a legacy drv->resume_early() callback
> and pci_restore_state() both require the device to be in D0"?
>
> If something else, maybe you could propose some text?
>
> 2) I assume pci_pm_thaw_noirq() should leave the device in a
> functionally equivalent state, whether it uses legacy PM or not. Do
> we want something like the patch below instead? If we *do* want to
> skip pci_restore_state() for legacy PM, maybe we should add a comment.
>
> 3) Documentation/power/pci.rst says:
>
>      ... devices have to be brought back to the fully functional
>      state ...
>
>      pci_pm_thaw_noirq() ... doesn't put the device into the full power
>      state and doesn't attempt to restore its standard configuration
>      registers.
>
> That doesn't seem consistent, and it looks like pci_pm_thaw_noirq()
> actually *does* put the device in full power (D0) state and restore
> config registers.

I would leave these questions to Rafael.

> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index a8124e47bf6e..30c721fd6bcf 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1068,7 +1068,7 @@ static int pci_pm_thaw_noirq(struct device *dev)
>  {
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
>  	struct device_driver *drv = dev->driver;
> -	int error = 0;
> +	int error;
>
>  	if (pcibios_pm_ops.thaw_noirq) {
>  		error = pcibios_pm_ops.thaw_noirq(dev);
> @@ -1076,9 +1076,6 @@ static int pci_pm_thaw_noirq(struct device *dev)
>  			return error;
>  	}
>
> -	if (pci_has_legacy_pm_support(pci_dev))
> -		return pci_legacy_resume_early(dev);
> -
>  	/*
>  	 * pci_restore_state() requires the device to be in D0 (because of MSI
>  	 * restoration among other things), so force it into D0 in case the
> @@ -1087,10 +1084,13 @@ static int pci_pm_thaw_noirq(struct device *dev)
>  	pci_set_power_state(pci_dev, PCI_D0);
>  	pci_restore_state(pci_dev);
>
> +	if (pci_has_legacy_pm_support(pci_dev))
> +		return pci_legacy_resume_early(dev);
> +
>  	if (drv && drv->pm && drv->pm->thaw_noirq)
> -		error = drv->pm->thaw_noirq(dev);
> +		return drv->pm->thaw_noirq(dev);
>
> -	return error;
> +	return 0;
>  }
>
>  static int pci_pm_thaw(struct device *dev)

The only real difference from my patch is that you moved

+	if (pci_has_legacy_pm_support(pci_dev))
+		return pci_legacy_resume_early(dev);

to after the line "pci_restore_state(pci_dev);"

This change looks good to me, and should also resolve the error message
I saw.

Thanks,
-- Dexuan