From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses reset_link() to recover from fatal errors. But during fatal error recovery, if the initial value of error status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER then even after successful recovery (using reset_link()) pcie_do_recovery() will report the recovery result as failure. So update the status of error after reset_link(). You can reproduce this issue by triggering a SW DPC using "DPC Software Trigger" bit in "DPC Control Register". You should see recovery failed dmesg log as below. pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000 pcieport 0000:00:16.0: DPC: software trigger detected pci 0000:04:00.0: AER: can't recover (no error_detected callback) pcieport 0000:00:16.0: AER: device recovery failed Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") Link: https://lore.kernel.org/r/15e702a33cc27314f9d43a06ccb408086a229cef.1583286655.git.sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> Acked-by: Keith Busch <keith.busch@xxxxxxxxx> Cc: Ashok Raj <ashok.raj@xxxxxxxxx> --- drivers/pci/pcie/err.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c index 01dfc8bb7ca0..1ac57e9e1e71 100644 --- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -203,14 +203,14 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state, bus = dev->subordinate; pci_dbg(dev, "broadcast error_detected message\n"); - if (state == pci_channel_io_frozen) + if (state == pci_channel_io_frozen) { pci_walk_bus(bus, report_frozen_detected, &status); - else + status = reset_link(dev, service); + if (status != PCI_ERS_RESULT_RECOVERED) + goto failed; + } else { pci_walk_bus(bus, report_normal_detected, &status); - - if (state == pci_channel_io_frozen && - reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED) - goto failed; + } if (status == PCI_ERS_RESULT_CAN_RECOVER) { status = PCI_ERS_RESULT_RECOVERED; -- 2.17.1