Hi Alex,

Nathan has been running a workload on the 5.14 kernel + the test patch, and
has run into some interesting softlockups and hardlockups.

The first happened on a secondary server running a Windows VM, with 7 (of 10)
1080TI GPUs passed through.

Full dmesg:
https://paste.ubuntu.com/p/Wx5hCBBXKb/

There aren't any "irq x: nobody cared" messages, and the crashkernel gets
stuck at the usual point, copying IR tables from dmar, which suggests an
ongoing interrupt storm.

Nathan disabled the "kernel.hardlockup_panic = 1" sysctl and managed to
reproduce the issue again, suggesting that we get stuck in kernel space for
too long without interrupts being serviced.

It starts with the NIC hitting a tx queue timeout, followed by an NMI to
unwind the stack of each CPU, although the stacks don't appear to indicate
where things are stuck. The server then remains softlocked, and keeps
unwinding stacks every 26 seconds or so, until it eventually hard locks up.

Full dmesg:
https://people.canonical.com/~mruffell/sf314568/1080TI_hardlockup.txt

The next interesting thing to report is what happened when Nathan started the
same Windows VM on the primary host we have been debugging on, with the 8x
2080TI GPUs. The VM got stuck, while the host kept responding just fine. When
Nathan reset the VM, he got 4x "irq xx: nobody cared" messages on IRQs 25, 27,
29 and 31, which at the time corresponded to the PEX 8747 upstream PCI
switches.

Interestingly, Nathan also observed 2x GPU audio devices sharing the same IRQ
line as the upstream PCI switch, although he mentioned this only occurred very
briefly, and the GPU audio devices were reassigned different IRQs shortly
afterward.

Full dmesg:
https://paste.ubuntu.com/p/C2V4CY3yjZ/

Output showing upstream ports belonging to those IRQs:
https://paste.ubuntu.com/p/6fkSbyFNWT/

Full lspci:
https://paste.ubuntu.com/p/CTX5kbjpRP/

Let us know if you would like any additional debug information. As always, we
are happy to test patches out.

Thanks,
Matthew