On 12/1/2015 11:43 PM, Sinan Kaya wrote:
> Setting the SERR# forwarding must have made the trick. This part was
> just an additional clearing of the errors.
>
> Nope, I was just enabling non-advisory fatal error from the mask
> register. Not clearing it.
>
> I'll retest without this bit.

Here we go.

/ # lspci
00:00.0 Class 0604: 17cb:0400
01:00.0 Class 0604: 10b5:8732
02:08.0 Class 0604: 10b5:8732
03:00.0 Class 0604: 10b5:8732
04:00.0 Class 0604: 10b5:8732
05:00.0 Class 0604: 10b5:8749
05:00.1 Class 0880: 10b5:87d0
05:00.2 Class 0880: 10b5:87d0
05:00.3 Class 0880: 10b5:87d0
05:00.4 Class 0880: 10b5:87d0
06:08.0 Class 0604: 10b5:8749
06:09.0 Class 0604: 10b5:8749
06:10.0 Class 0604: 10b5:8749
06:11.0 Class 0604: 10b5:8749
06:12.0 Class 0604: 10b5:8749
07:00.0 Class ff00: 1172:e001

This is after removing the PCI_ERR_COR_ADV_NFAT setting which looks much
better to me. I'll post a new patch without PCI_ERR_COR_ADV_NFAT.

/ # [   24.358445] pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
[   24.358559] pcieport 0006:06:08.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=06
[   24.358571] pcieport 0006:06:08.0:   device [10b5:8749] error status/mask=00002081/0000e000
[   24.358583] pcieport 0006:06:08.0:    [ 0] Receiver Error (First)
[   24.358593] pcieport 0006:06:08.0:    [ 7] Bad DLLP
[   24.358616] pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
[   24.358708] pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
[   24.358800] pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
[   24.358892] pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640

Below is the test result with the original code.

<remove card>
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
pcieport 0006:01:00.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0100(Receiver ID)
pcieport 0006:01:00.0:   device [10b5:8732] error status/mask=00002000/0000c000
pcieport 0006:01:00.0:    [13] Advisory Non-Fatal
pcieport 0006:02:08.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0240(Receiver ID)
pcieport 0006:02:08.0:   device [10b5:8732] error status/mask=00002000/0000c000
pcieport 0006:02:08.0:    [13] Advisory Non-Fatal
pcieport 0006:03:00.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0300(Receiver ID)
pcieport 0006:03:00.0:   device [10b5:8732] error status/mask=00002000/0000c000
pcieport 0006:03:00.0:    [13] Advisory Non-Fatal
pcieport 0006:04:00.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0400(Receiver ID)
pcieport 0006:04:00.0:   device [10b5:8732] error status/mask=00002000/0000c000
pcieport 0006:04:00.0:    [13] Advisory Non-Fatal
pcieport 0006:06:08.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0640(Receiver ID)
pcieport 0006:06:08.0:   device [10b5:8749] error status/mask=00002001/0000c000
pcieport 0006:06:08.0:    [ 0] Receiver Error
pcieport 0006:06:08.0:    [13] Advisory Non-Fatal
pcieport 0006:06:08.0:   Error of this Agent(0640) is reported first
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
pcieport 0006:06:09.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0648(Receiver ID)
pcieport 0006:06:09.0:   device [10b5:8749] error status/mask=00002000/00008000
pcieport 0006:06:09.0:    [13] Advisory Non-Fatal
pcieport 0006:06:10.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0680(Receiver ID)
pcieport 0006:06:10.0:   device [10b5:8749] error status/mask=00002000/0000c000
pcieport 0006:06:10.0:    [13] Advisory Non-Fatal
pcieport 0006:06:11.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0688(Receiver ID)
pcieport 0006:06:11.0:   device [10b5:8749] error status/mask=00002000/00008000
pcieport 0006:06:11.0:    [13] Advisory Non-Fatal
pcieport 0006:06:12.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, id=0690(Receiver ID)
pcieport 0006:06:12.0:   device [10b5:8749] error status/mask=00002000/00008000
pcieport 0006:06:12.0:    [13] Advisory Non-Fatal
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
pcieport 0006:00:00.0: AER: Multiple Corrected error received: id=0640
/ #

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
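
For anyone reading the status/mask pairs in the logs above, here is a minimal Python sketch of how they can be decoded by hand. It is not taken from the patch or from the kernel sources. It assumes that only bits set in the status value and clear in the mask value are listed, which is consistent with the output in this message (bit 13 disappears once the mask reads 0000e000). The table `COR_BITS` and the function `decode` are names made up for this illustration; the bit positions follow the AER Correctable Error Status register layout, and the names for bits 0, 7 and 13 are the ones printed in the logs.

```python
# Hypothetical helper, for illustration only: decode an AER correctable
# "error status/mask" pair as it appears in the logs.
COR_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode(status, mask):
    """Return (bit, name) for every bit set in status and clear in mask."""
    visible = status & ~mask  # assumption: masked bits are not listed
    return [
        (bit, COR_BITS.get(bit, "Unknown"))
        for bit in range(32)
        if visible & (1 << bit)
    ]


if __name__ == "__main__":
    # One pair from each test run in this message.
    for status, mask in [(0x00002081, 0x0000E000), (0x00002000, 0x0000C000)]:
        print(f"status/mask={status:08x}/{mask:08x}")
        for bit, name in decode(status, mask):
            print(f"  [{bit:2d}] {name}")
```

With the first pair the sketch lists bits 0 and 7 only, because bit 13 is set in both status and mask. With the second pair bit 13 is unmasked and shows up as Advisory Non-Fatal.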