On Mon, Nov 02, 2020 at 15:10 Hedi Berriche wrote:
Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") broke pcie_do_recovery(): updating status after reset_link() has the ill side effect of causing recovery to fail if the error status is PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following code will *never* run in the case of a successful reset_link() 177 if (status == PCI_ERS_RESULT_CAN_RECOVER) { ... 181 } 183 if (status == PCI_ERS_RESULT_NEED_RESET) { ... 192 } For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not calling ->slot_reset() (because we skip report_slot_reset()) thus breaking driver (re)initialisation. Don't clobber status with the return value of reset_link(); set status to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER. Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") Signed-off-by: Hedi Berriche <hedi.berriche@xxxxxxx> Reviewed-by: Sinan Kaya <okaya@xxxxxxxxxx> Cc: Russ Anderson <rja@xxxxxxx> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> Cc: Ashok Raj <ashok.raj@xxxxxxxxx> Cc: Joerg Roedel <jroedel@xxxxxxxx> Cc: stable@xxxxxxxxxx # v5.7+ --- drivers/pci/pcie/err.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c index c543f419d8f9..2730826cfd8a 100644 --- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, pci_dbg(dev, "broadcast error_detected message\n"); if (state == pci_channel_io_frozen) { pci_walk_bus(bus, report_frozen_detected, &status); - status = reset_link(dev); - if (status != PCI_ERS_RESULT_RECOVERED) { + if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) { pci_warn(dev, "link reset failed\n"); goto failed; + } else { + if (status == PCI_ERS_RESULT_DISCONNECT || + status == PCI_ERS_RESULT_NO_AER_DRIVER) + status = PCI_ERS_RESULT_RECOVERED; } } else { pci_walk_bus(bus, report_normal_detected, &status); -- 2.28.0
Bjorn, Sorry to bug you, but could you please cast your eyes on this patch and let me know whether you have any concerns that might be barring it from inclusion. Cheers, Hedi. -- Be careful of reading health books, you might die of a misprint. -- Mark Twain