https://bugzilla.kernel.org/show_bug.cgi?id=202055

--- Comment #20 from Alex Williamson (alex.williamson@xxxxxxxxxx) ---
Hi Dongli,

(In reply to Dongli Zhang from comment #19)
>
> The kernel I use is the most recent upstream version including commit
> aa667c6408d20a84c7637420bc3b7aa0abab59a2.
>
> Is there a way to know if IDT switch is in the topology?

No IDT switch in this system, so you shouldn't have that issue.

> The env is an dell desktop I use at home to debug program myself.
>
> # lspci
> 00:00.0 Host bridge: Intel Corporation Device 591f (rev 05)
> 00:02.0 VGA compatible controller: Intel Corporation Device 5912 (rev 04)
> 00:14.0 USB controller: Intel Corporation Device a2af
> 00:14.2 Signal processing controller: Intel Corporation Device a2b1
> 00:16.0 Communication controller: Intel Corporation Device a2ba
> 00:17.0 SATA controller: Intel Corporation Device a282
> 00:1b.0 PCI bridge: Intel Corporation Device a2e7 (rev f0)
> 00:1d.0 PCI bridge: Intel Corporation Device a298 (rev f0)
> 00:1f.0 ISA bridge: Intel Corporation Device a2c6
> 00:1f.2 Memory controller: Intel Corporation Device a2a1
> 00:1f.3 Audio device: Intel Corporation Device a2f0
> 00:1f.4 SMBus: Intel Corporation Device a2a3
> 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (5) I219-V
> 01:00.0 Non-Volatile memory controller: Intel Corporation Device f1a6 (rev 03)
> 02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
> 02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

I bought an ADATA XPG SX8200 drive to debug further. In some systems it works
fine with the attached patch, but in another I think I'm getting something
similar to what you see. My system has Downstream Port Containment (DPC)
support, so I think that catches the error before AER does. If I disable ACS
Source Validation on the root port, the errors go away entirely, so I think
we're still dealing with the ACS violation that you see.
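
For anyone who wants to confirm whether ACS Source Validation is currently
enabled on a port, here is a rough user-space sketch, not something taken from
the patch. It walks the PCIe extended capability list in a raw config-space
dump and tests the Source Validation bit in the ACS Control register. The
helper names (rd16, rd32, find_acs_cap, acs_source_validation_enabled) are
made up for illustration. The register constants follow the kernel's
pci_regs.h definitions as far as I know, so please double check them against
your tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CFG_SPACE_EXP_SIZE 4096   /* size of PCIe extended config space */
#define PCI_EXT_CAP_START      0x100  /* first extended capability header */
#define PCI_EXT_CAP_ID_ACS     0x000d /* Access Control Services */
#define PCI_ACS_CTRL           0x06   /* ACS Control register offset */
#define PCI_ACS_SV             0x0001 /* Source Validation enable bit */

/* Little-endian reads from a raw config-space buffer (hypothetical helpers). */
static uint16_t rd16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)((uint16_t)cfg[off] | ((uint16_t)cfg[off + 1] << 8));
}

static uint32_t rd32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/* Return the offset of the ACS extended capability, or 0 if not found. */
static size_t find_acs_cap(const uint8_t *cfg, size_t len)
{
	size_t pos = PCI_EXT_CAP_START;
	/* Bound the walk so a malformed list cannot loop forever. */
	int ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_EXT_CAP_START) / 8;

	assert(cfg != NULL);

	while (pos >= PCI_EXT_CAP_START && pos + 8 <= len && ttl-- > 0) {
		uint32_t hdr = rd32(cfg, pos);

		/* All zeros or all ones means no (more) capabilities. */
		if (hdr == 0 || hdr == 0xffffffffu)
			break;
		if ((hdr & 0xffff) == PCI_EXT_CAP_ID_ACS)
			return pos;
		/* Bits 31:20 hold the next capability offset. */
		pos = (hdr >> 20) & 0xffc;
	}
	return 0;
}

/* 1 if Source Validation is enabled, 0 if disabled, -1 if no ACS capability. */
static int acs_source_validation_enabled(const uint8_t *cfg, size_t len)
{
	size_t pos = find_acs_cap(cfg, len);

	if (!pos)
		return -1;
	return (rd16(cfg, pos + PCI_ACS_CTRL) & PCI_ACS_SV) ? 1 : 0;
}
```

The buffer would typically be filled by reading the port's sysfs "config" file
as root, since unprivileged reads are limited to the standard header and never
reach the extended capabilities at 0x100.
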

One clue, though, is that triggering the bus reset via setpci as in comment 10
does not trigger the fault. I then stumbled on the fact that adding a delay in
the kernel code path prior to the bus reset avoids the issue. Long story
short, could you try adding a delay to the previous patch? For example, make
the new function in drivers/pci/quirks.c look like this:

static int prefer_bus_reset(struct pci_dev *dev, int probe)
{
	msleep(100);

	return pci_parent_bus_reset(dev, probe);
}

I look forward to seeing if this works around the AER fault in your system as
well.
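
A possible refinement, untested and only a sketch: reset methods are also
called with probe set just to ask whether the method is available, and the
delay probably only matters before a real reset. The user-space model below
shows that variant. The struct, msleep() and pci_parent_bus_reset() here are
stand-in stubs so the control flow can be compiled outside the kernel. They
are not the kernel implementations, and the counters exist only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct pci_dev (illustrative only). */
struct pci_dev {
	const char *name;
};

static unsigned int slept_ms;   /* total delay requested through msleep() */
static int reset_calls;         /* number of non-probe resets issued */

/* Stub: records the requested delay instead of actually sleeping. */
static void msleep(unsigned int ms)
{
	slept_ms += ms;
}

/* Stub: reports the reset as supported and counts real (non-probe) resets. */
static int pci_parent_bus_reset(struct pci_dev *dev, int probe)
{
	(void)dev;
	if (!probe)
		reset_calls++;
	return 0;
}

/* Variant of the quirk that delays only when a reset is really performed. */
static int prefer_bus_reset(struct pci_dev *dev, int probe)
{
	assert(dev != NULL);

	if (!probe)
		msleep(100);

	return pci_parent_bus_reset(dev, probe);
}
```

For the purpose of the experiment the unconditional msleep(100) is just as
good. This only avoids a 100 ms stall each time the method is probed.
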