I'm (still) trying to provoke aerdrv to report a PCIe error.

I've found a system where a 4.13 kernel (Ubuntu 17.0) reports
'_OSC: OS now controls [... AER ...]' and aerdrv (and PCIe PME) use
interrupts 125 and 126.
One of those host ports is connected to an igb ethernet chip, the other
to one of our cards.

I can generate PCIe read or write cycles that are outside the BARs - and
the card duly fills in data in its own AER registers.
But I'm not seeing any messages from aerdrv, or even the interrupt count
going up.

CPU is an i7-7700 (according to /proc/cpuinfo).

The root port's (Intel Sunrise Point-H) AER registers are:

    Capabilities: [100 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
        RootCmd: CERptEn+ NFERptEn+ FERptEn+
        RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
                 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
        ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000

Our card's are:

    Capabilities: [800 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 14, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 40000001 0000000f df202000 00000000

The HeaderLog is that of a write beyond the end of BAR2.
If I unmask NonFatalErr the HeaderLog contains that of invalid reads.
(Writing to the status bits clears them.)

I think the endpoint should be sending some kind of TLP to the root port
that would update the root port's status registers and then interrupt
the host.
Unfortunately I can only trace TLPs that match the BARs.

I can try taking down the PCIe link - I believe the root port should log
that itself?

Any ideas?
Is it worth me trying to get the igb to generate the same errors?

	David
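
P.S. In case anyone wants to check the HeaderLog reading by hand, here is a
minimal sketch of how the logged DWORDs can be decoded. It assumes the
standard PCIe memory-request header layout (Fmt in DW0 bits 31:29, Type in
bits 28:24, Length in bits 9:0, byte enables in the low byte of DW1) and only
handles memory reads and writes. The function name decode_header_log is made
up for this example; it is not part of lspci or the kernel.

```python
# Decode an AER HeaderLog (four DWORDs as printed by lspci) for memory requests.
# Illustrative helper only; other TLP types are reported as "other".

def decode_header_log(dw0, dw1, dw2, dw3=0):
    fmt = (dw0 >> 29) & 0x7        # bits 31:29: header size / data present
    tlp_type = (dw0 >> 24) & 0x1F  # bits 28:24: TLP type
    length = dw0 & 0x3FF           # bits 9:0: payload length in DWORDs
    if length == 0:
        length = 1024              # a Length field of 0 encodes 1024 DWORDs

    info = {"fmt": fmt, "type": tlp_type, "length_dw": length}

    if tlp_type == 0 and fmt in (0, 1, 2, 3):
        # Fmt bit 1 set means the TLP carries data (a write);
        # Fmt bit 0 set means a 4DW header (64-bit address).
        info["kind"] = "MWr" if fmt & 0x2 else "MRd"
        info["requester_id"] = (dw1 >> 16) & 0xFFFF
        info["tag"] = (dw1 >> 8) & 0xFF
        info["last_be"] = (dw1 >> 4) & 0xF
        info["first_be"] = dw1 & 0xF
        if fmt & 0x1:
            info["address"] = (dw2 << 32) | (dw3 & 0xFFFFFFFC)
        else:
            info["address"] = dw2 & 0xFFFFFFFC
    else:
        info["kind"] = "other"
    return info


if __name__ == "__main__":
    # HeaderLog from our card, as quoted above.
    hdr = decode_header_log(0x40000001, 0x0000000F, 0xDF202000, 0x00000000)
    print(hdr["kind"], hex(hdr["address"]), hdr["length_dw"], hex(hdr["first_be"]))
```

For the card's HeaderLog this prints "MWr 0xdf202000 1 0xf", i.e. a one-DWORD
32-bit memory write with all four byte enables set. The First Error Pointer
of 14 (hex, so bit 20) matches the UnsupReq bit that is set in UESta.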