From: Bjorn Helgaas
> Sent: 16 October 2017 23:07
...
> I don't know how to tell what Windows is doing with respect to AER.

Just hopeful someone might :-)

> > I've 'bodged' the Linux kernel to think that the BIOS gave it control
> > of AER (set OSC_PCI_EXPRESS_AER_CONTROL into *mask and
> > root->osc_control_set in acpi_pci_osc_control_set()).
> > I think this is the earliest place the info is saved.
> > This is enough to get the 'pcieport ... AER enabled with IRQ nn' messages
> > (It is sharing the interrupt with PME).
> >
> > I've made sure my card is beneath one of the cpu bridges (the companion
> > chip host bridges don't support AER).
> > I've also bodged the driver to ioremap() an area larger than one of the
> > BARs so I can generate PCIe read and write TLPs that are outside the
> > BAR ranges.
> > Reads set CESta: NonFatalErr.
> > Writes set UESta: UnsupReq and save the TLP header.
> > No interrupts to aerdrv are generated.
> > I can clear the status bits using setpci.
> > Should I expect these errors to raise interrupts?
>
> I think that depends on the Root Error Command register.

AFAICT that is set to 7 (all interrupts enabled), but nothing gets set in
the 'pending' word that follows.

If I unmask NonFatalErr from the card's CEMsk, read errors also set UnsupReq
and save the TLP header.

Unfortunately we don't have a PCIe analyser (too expensive), so I can't
see any TLP generated by the low level hardware.
(I can see all the read/write/completions that match the BARs.)

I've also looked at the lspci -vvnnxxxx output from one of our Dell server
systems (I've not got one to play with).
They have the Root Error Command register set to zero.
The RootCtl register (in the main root port capabilities) has both
ErrNon-Fatal and ErrFatal set.
I think this means that the errors I'm generating would set CERR and
probably generate an NMI!

It is likely to explain why taking down the PCIe physical layer generates an
NMI even after 'echo 1 > xxx/remove'.
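
For anyone comparing the two setups, here is a rough sketch of how the register values mentioned above decode. The bit masks are the ones the PCIe spec defines for the AER Root Error Command register and the Root Control register (the same values appear in the kernel's pci_regs.h). The helper name decode() and the table names are made up for illustration, and the 0x06 used for the Dell RootCtl is an assumption covering only the two bits named above; the real register may have other bits set.

```python
# Root Error Command register (AER capability): which error classes
# are allowed to raise an interrupt from the root port.
ROOT_ERR_CMD_BITS = {
    0x1: "Correctable Error Reporting Enable",
    0x2: "Non-Fatal Error Reporting Enable",
    0x4: "Fatal Error Reporting Enable",
}

# Root Control register (PCI Express capability): the low three bits
# request a System Error instead of (or as well as) an AER interrupt.
ROOT_CTL_BITS = {
    0x01: "System Error on Correctable Error Enable",
    0x02: "System Error on Non-Fatal Error Enable",
    0x04: "System Error on Fatal Error Enable",
    0x08: "PME Interrupt Enable",
    0x10: "CRS Software Visibility Enable",
}


def decode(value, bits):
    """Return the names of the bits set in value, lowest bit first."""
    return [name for mask, name in sorted(bits.items()) if value & mask]


if __name__ == "__main__":
    # Test system: Root Error Command reads back as 7.
    print("Root Error Command 0x7:", decode(0x7, ROOT_ERR_CMD_BITS))
    # Dell server: Root Error Command is 0, RootCtl has the Non-Fatal
    # and Fatal bits set (0x06 is assumed, see above).
    print("Root Error Command 0x0:", decode(0x0, ROOT_ERR_CMD_BITS))
    print("RootCtl 0x06:", decode(0x06, ROOT_CTL_BITS))
```

On the Dell values this lists no AER interrupt enables at all, only the two System Error enables, which fits the NMI behaviour described above.
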
Maybe the kernel should be unsetting these bits when a card is removed and
restoring them after a rescan?

	David