On Wednesday 05 May 2021 15:20:11 David Laight wrote:
> From: Pali Rohár
> > Sent: 05 May 2021 14:03
> ...
> > I already figured out that CPU receive external abort also when trying
> > to issue a new PIO transfer for accessing PCI config space while
> > previous transfer has not finished yet. And also there is no way (at
> > least in documentation) which allows to "mask" this external abort. But
> > this issue can be fixed in pci-aardvark.c driver to disallow access to
> > config space while previous transfer is still running (I will send patch
> > for this one).
>
> By the sound of the above you need to put a global spinlock around
> all PCIe config space accesses.

The kernel already uses raw_spin_lock_irqsave(); see the pci_lock_config()
macro in pci/access.c, which implements this global lock for config space
access.

But the issue is that pci-driver.c does not wait for the transfer to
finish before returning from the function, and the return unlocks this
spin lock...

A week ago I fixed this issue in U-Boot, and a similar fix would also be
needed for the kernel:
https://source.denx.de/u-boot/u-boot/-/commit/eccbd4ad8e4e

But this issue is not related to my original report about XHCI & PCI.

> Is this the horrid hardware that can't do a 'normal' PCIe transfer
> while a config space access is in progress?

The issue is different. You cannot do a config space PIO transfer while
another config space PIO transfer is in progress.

> If that is true then you have bigger problems.
> Especially if it is an SMP system.

I really hope that a memory read or write transfer can be initiated while
a config transfer is in progress. The Marvell A3720 platform, on which
this pci aardvark controller is found, is a 2-core CPU SoC.

At least I have not seen any abort when the PCIe link is up, the card is
connected and the previous config access transfer has finished.

> > So seems that PCIe controller HW triggers these external aborts when
> > device on PCIe bus is not accessible anymore.
> >
> > If this issue is really caused by MMIO access from xhci driver when
> > device is not accessible on the bus anymore, can we do something to
> > prevent this kernel crash? Somehow mask that external abort in kernel
> > for a time during MMIO access?
>
> If it is a cycle abort then the interrupted address is probably
> that of the MMIO instruction.
> So you need to catch the abort, emulate the instruction and
> then return to the next one.

Does the kernel have an API and infrastructure for catching these aborts
and running a driver's own handler when an abort happens?

> This probably requires an exception table containing the address
> of every readb/w/l() instruction.
>
> If you get a similar error on writes it is likely to be a few
> instructions after the actual writeb/w/l() instruction.
> Writes are normally 'posted' and asynchronous.
>
> If you are really lucky you can get enough state out of the
> abort handler to fixup/ignore the cycle without an
> exception table.
>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
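
To make the exception table idea above more concrete, here is a minimal
user-space sketch in C. It is not kernel code and not taken from any
existing driver. The names mmio_fixup_entry, mmio_fixup_search,
fake_regs and handle_abort are all hypothetical, and the assumption that
an aborted read should yield all-ones (as a failed PCI read
conventionally does) is mine. The idea is similar in spirit to the
exception tables the kernel already uses for user-space access fixups:
look up the faulting instruction address in a sorted table, and if it is
a known MMIO read, fake the result and resume at the recorded address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one per MMIO read instruction. */
struct mmio_fixup_entry {
	uintptr_t insn;   /* address of the readb/w/l() load instruction */
	uintptr_t fixup;  /* address to resume at after the abort */
};

/* Simplified stand-in for the saved CPU state at the time of the abort. */
struct fake_regs {
	uintptr_t pc;     /* faulting instruction address */
	uint64_t dest;    /* destination register of the load (simplified) */
};

/* Binary search; the table must be sorted by ascending insn address. */
static const struct mmio_fixup_entry *
mmio_fixup_search(const struct mmio_fixup_entry *tbl, size_t n, uintptr_t pc)
{
	size_t lo = 0, hi = n;

	assert(tbl != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return &tbl[mid];
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/*
 * Returns 1 if the abort was recognised and fixed up, 0 otherwise.
 * On success the read is emulated as returning all-ones and execution
 * would continue at the fixup address; on failure regs is untouched
 * and the caller would fall back to the normal fatal abort path.
 */
static int handle_abort(struct fake_regs *regs,
			const struct mmio_fixup_entry *tbl, size_t n)
{
	const struct mmio_fixup_entry *e = mmio_fixup_search(tbl, n, regs->pc);

	if (!e)
		return 0;
	regs->dest = UINT64_MAX;  /* assumed "device gone" read value */
	regs->pc = e->fixup;
	return 1;
}
```

This only covers the synchronous read case. As noted above, an abort
caused by a posted write may be reported a few instructions later, so a
lookup by exact instruction address would likely not match there.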