On Fri, Jul 16, 2021 at 02:25:04PM +0200, Pali Rohár wrote:
> commit f18139966d072dab8e4398c95ce955a9742e04f7 upstream.
>
> Trying to start a new PIO transfer by writing value 0 in PIO_START register
> when previous transfer has not yet completed (which is indicated by value 1
> in PIO_START) causes an External Abort on CPU, which results in kernel
> panic:
>
>     SError Interrupt on CPU0, code 0xbf000002 -- SError
>     Kernel panic - not syncing: Asynchronous SError Interrupt
>
> To prevent kernel panic, it is required to reject a new PIO transfer when
> previous one has not finished yet.
>
> If previous PIO transfer is not finished yet, the kernel may issue a new
> PIO request only if the previous PIO transfer timed out.
>
> In the past the root cause of this issue was incorrectly identified (as it
> often happens during link retraining or after link down event) and special
> hack was implemented in Trusted Firmware to catch all SError events in EL3,
> to ignore errors with code 0xbf000002 and not forwarding any other errors
> to kernel and instead throw panic from EL3 Trusted Firmware handler.
>
> Links to discussion and patches about this issue:
> https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
> https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@xxxxxxxxxxxx/
> https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@xxxxxxxxxxx/
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541
>
> But the real cause was the fact that during link retraining or after link
> down event the PIO transfer may take longer time, up to the 1.44s until it
> times out. This increased probability that a new PIO transfer would be
> issued by kernel while previous one has not finished yet.
>
> After applying this change into the kernel, it is possible to revert the
> mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.
>
> Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@xxxxxxxxxx
> Signed-off-by: Pali Rohár <pali@xxxxxxxxxx>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Reviewed-by: Marek Behún <kabel@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
> [pali: Backported to 4.19 version]
> ---
> This patch is backported to 4.19 version. It depends on commit
> 7fbcb5da811b as presented on Cc: stable line.
> ---
>  drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 9 deletions(-)

Now queued up, thanks.

greg k-h
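
Illustrative note for readers of the thread: the rule stated in the commit
message (reject a new PIO transfer while PIO_START still reads 1, unless the
previous transfer timed out) can be sketched as a small user-space C model.
This is only a sketch under stated assumptions, not the pci-aardvark.c change
itself: struct pio_model, pio_try_start(), pio_complete() and
pio_mark_timed_out() are hypothetical names, and plain struct fields stand in
for the hardware registers the real driver would read.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for controller state; not the real register layout. */
struct pio_model {
	unsigned int pio_start;   /* reads 1 while a transfer is still in flight */
	bool prev_timed_out;      /* set once the caller gave up waiting on it */
};

/*
 * Try to issue a new PIO transfer.
 * Returns 0 when the transfer is started, -EBUSY when it is rejected
 * because the previous one is still running and has not timed out.
 */
static int pio_try_start(struct pio_model *m)
{
	assert(m != NULL);

	if (m->pio_start) {
		if (!m->prev_timed_out)
			return -EBUSY;          /* previous transfer still running */
		m->prev_timed_out = false;      /* timed out earlier: may proceed */
	}

	m->pio_start = 1;                       /* new transfer now in flight */
	return 0;
}

/* Model the hardware finishing the transfer normally. */
static void pio_complete(struct pio_model *m)
{
	m->pio_start = 0;
}

/* Model the caller's wait loop expiring before completion. */
static void pio_mark_timed_out(struct pio_model *m)
{
	m->prev_timed_out = true;
}
```

In this model a second pio_try_start() issued right after the first returns
-EBUSY, while one issued after pio_complete() or after pio_mark_timed_out()
succeeds. Returning an error code instead of touching the busy register is the
point of the sketch: it is the rejected write, not the slow transfer, that the
commit message identifies as the trigger for the SError.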