On Thu, Aug 16, 2018 at 05:05:30PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2018-08-16 at 16:51 +1000, Benjamin Herrenschmidt wrote:
> > No, this is wrong and not the intent of the error handling.
> >
> > You seem to be applying PCIe specific concepts brain-farted at Intel
> > that are way way away from what we care about in practice and in Linux.
> >
> > > e.g. some driver handle errors ERR_NONFATAL or FATAL in similar ways
> > > e.g.
> > > ioat_pcie_error_detected(); calls ioat_shutdown(); in case of
> > > ERR_NONFATAL
> > > otherwise ioat_shutdown() in case of ERR_FATAL.
> >
> > Since when the error handling callbacks even have the concept of FATAL
> > vs. non-fatal ? This doesn't appear anywhere in the prototype of the
> > struct pci_error_handlers and shouldn't.
>
> Ugh... I just saw the changes you did to
> Documentation/PCI/pci-error-recovery.txt and I would very much like to
> revert those !
>
> Bjorn, you shouldn't let changes to the PCI error handling through
> without acks from us, it looks like we didn't notice (we can't possibly
> monitor all lists).

Sorry, they were certainly very visible on linux-pci, but I should
have added you to cc if you weren't there already.

Please update MAINTAINERS if it's incomplete or out of date (I can't
possibly know who might be interested in every change).

> Bjorn, please revert all of those changes.

Please send the appropriate patches and we'll go from there.