Bjorn, On Sun, Oct 4, 2020 at 12:44 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Sat, Oct 03, 2020 at 03:55:13AM -0400, Ethan Zhao wrote: > > When uncorrectable error happens, AER driver and DPC driver interrupt > > handlers likely call > > > > pcie_do_recovery() > > ->pci_walk_bus() > > ->report_frozen_detected() > > > > with pci_channel_io_frozen the same time. > > If pci_dev_set_io_state() return true even if the original state is > > pci_channel_io_frozen, that will cause AER or DPC handler re-enter > > the error detecting and recovery procedure one after another. > > The result is the recovery flow mixed between AER and DPC. > > So simplify the pci_dev_set_io_state() function to only return true > > when dev->error_state is really changed. > > > > Signed-off-by: Ethan Zhao <haifeng.zhao@xxxxxxxxx> > > Tested-by: Wen Jin <wen.jin@xxxxxxxxx> > > Tested-by: Shanshan Zhang <ShanshanX.Zhang@xxxxxxxxx> > > Reviewed-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx> > > Reviewed-by: Andy Shevchenko <andy.shevchenko@xxxxxxxxx> > > --- > > Changnes: > > v2: revise description and code according to suggestion from Andy. > > v3: change code to simpler. > > v4: no change. > > v5: no change. > > v6: no change. > > v7: changed based on Bjorn's code and truth table. > > > > drivers/pci/pci.h | 53 ++++++++++++++++++----------------------------- > > 1 file changed, 20 insertions(+), 33 deletions(-) > > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > > index 455b32187abd..47af1ff2a286 100644 > > --- a/drivers/pci/pci.h > > +++ b/drivers/pci/pci.h > > @@ -354,44 +354,31 @@ struct pci_sriov { > > * > > * Must be called with device_lock held. > > * > > - * Returns true if state has been changed to the requested state. > > + * Returns true if state has been really changed to the requested state. > > */ > > static inline bool pci_dev_set_io_state(struct pci_dev *dev, > > pci_channel_state_t new) > > { > > - bool changed = false; > > - > > device_lock_assert(&dev->dev); > > - switch (new) { > > - case pci_channel_io_perm_failure: > > - switch (dev->error_state) { > > - case pci_channel_io_frozen: > > - case pci_channel_io_normal: > > - case pci_channel_io_perm_failure: > > - changed = true; > > - break; > > - } > > - break; > > - case pci_channel_io_frozen: > > - switch (dev->error_state) { > > - case pci_channel_io_frozen: > > - case pci_channel_io_normal: > > - changed = true; > > - break; > > - } > > - break; > > - case pci_channel_io_normal: > > - switch (dev->error_state) { > > - case pci_channel_io_frozen: > > - case pci_channel_io_normal: > > - changed = true; > > - break; > > - } > > - break; > > - } > > - if (changed) > > - dev->error_state = new; > > - return changed; > > + > > +/* > > + * Truth table: > > + * requested new state > > + * current ------------------------------------------ > > + * state normal frozen perm_failure > > + * ------------ + ------------- ------------- ------------ > > + * normal | normal frozen perm_failure > > + * frozen | normal frozen perm_failure > > + * perm_failure | perm_failure* perm_failure* perm_failure > > + */ > > + > > + if (dev->error_state == pci_channel_io_perm_failure) > > + return false; > > + else if (dev->error_state == new) > > + return false; > > + > > + dev->error_state = new; > > + return true; > > No, you missed the point. I want > > 1) One patch that converts the "switch" to the shorter "if" > statements. This one will be big and ugly, but should not change > the functionality at all, and it should be pretty easy to verify > that since there aren't very many states involved. > > Since this one is pure code simplification, the commit log won't > say anything at all about AER or DPC or their requirements > because it's not changing any behavior. > > 2) A separate patch that's tiny and makes whatever functional change > you need. Make sense, clear, this time. Thanks, Ethan > > > } > > > > static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused) > > -- > > 2.18.4 > >