When uncorrectable error happens, AER driver and DPC driver interrupt handlers likely call pcie_do_recovery() ->pci_walk_bus() ->report_frozen_detected() with pci_channel_io_frozen the same time. If pci_dev_set_io_state() return true even if the original state is pci_channel_io_frozen, that will cause AER or DPC handler re-enter the error detecting and recovery procedure one after another. The result is the recovery flow mixed between AER and DPC. So simplify the pci_dev_set_io_state() function to only return true when dev->error_state is really changed. Signed-off-by: Ethan Zhao <haifeng.zhao@xxxxxxxxx> Tested-by: Wen Jin <wen.jin@xxxxxxxxx> Tested-by: Shanshan Zhang <ShanshanX.Zhang@xxxxxxxxx> Reviewed-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx> Reviewed-by: Andy Shevchenko <andy.shevchenko@xxxxxxxxx> --- Changnes: v2: revise description and code according to suggestion from Andy. v3: change code to simpler. v4: no change. v5: no change. v6: no change. v7: changed based on Bjorn's code and truth table. drivers/pci/pci.h | 53 ++++++++++++++++++----------------------------- 1 file changed, 20 insertions(+), 33 deletions(-) diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 455b32187abd..47af1ff2a286 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -354,44 +354,31 @@ struct pci_sriov { * * Must be called with device_lock held. * - * Returns true if state has been changed to the requested state. + * Returns true if state has been really changed to the requested state. */ static inline bool pci_dev_set_io_state(struct pci_dev *dev, pci_channel_state_t new) { - bool changed = false; - device_lock_assert(&dev->dev); - switch (new) { - case pci_channel_io_perm_failure: - switch (dev->error_state) { - case pci_channel_io_frozen: - case pci_channel_io_normal: - case pci_channel_io_perm_failure: - changed = true; - break; - } - break; - case pci_channel_io_frozen: - switch (dev->error_state) { - case pci_channel_io_frozen: - case pci_channel_io_normal: - changed = true; - break; - } - break; - case pci_channel_io_normal: - switch (dev->error_state) { - case pci_channel_io_frozen: - case pci_channel_io_normal: - changed = true; - break; - } - break; - } - if (changed) - dev->error_state = new; - return changed; + +/* + * Truth table: + * requested new state + * current ------------------------------------------ + * state normal frozen perm_failure + * ------------ + ------------- ------------- ------------ + * normal | normal frozen perm_failure + * frozen | normal frozen perm_failure + * perm_failure | perm_failure* perm_failure* perm_failure + */ + + if (dev->error_state == pci_channel_io_perm_failure) + return false; + else if (dev->error_state == new) + return false; + + dev->error_state = new; + return true; } static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused) -- 2.18.4