Hi,

I'm Kexin. I'm working on a Linux NVMe system. Some of my tests triggered PCI AER uncorrectable errors, after which the pci_bus_read_config_XXX calls became very slow, taking 1.3 seconds per access. This caused a lot of CPU scheduling problems, for example 'Thread not rescheduled for xxx ms after irq xxx' or 'Softirq x took xxx ms', and finally the kernel rebooted due to a soft lockup.

There is definitely a hardware issue, but could the kernel take some action to avoid crashing and handle this gracefully? My system is currently running kernel 4.4.182.

Thanks,
Kexin