On Wed, 15 Sep 2021 16:44:38 +1200
Matthew Ruffell <matthew.ruffell@xxxxxxxxxxxxx> wrote:

> On 15/09/21 4:43 am, Alex Williamson wrote:
> >
> > FWIW, I have access to a system with an NVIDIA K1 and M60, both use
> > this same switch on-card and I've not experienced any issues assigning
> > all the GPUs to a single VM.  Topo:
> >
> > +-[0000:40]-+-02.0-[42-47]----00.0-[43-47]--+-08.0-[44]----00.0
> > |                                           +-09.0-[45]----00.0
> > |                                           +-10.0-[46]----00.0
> > |                                           \-11.0-[47]----00.0
> > \-[0000:00]-+-03.0-[04-07]----00.0-[05-07]--+-08.0-[06]----00.0
> >                                             \-10.0-[07]----00.0

I've actually found that the above configuration, assigning all 6 GPUs
to a VM, reproduces this pretty readily by simply rebooting the VM.  In
my case, I don't have the panic-on-warn/oops that must be set on your
kernel, so the result is far more benign: the IRQ gets masked until
it's re-registered.

The fact that my upstream ports are using MSI seems irrelevant.

Adding debugging to the vfio-pci interrupt handler, I can see that it's
correctly deferring the interrupt, as the GPU device is not identifying
itself as the source of the interrupt via the status register.  In
fact, setting the disable INTx bit in the GPU command register while
the interrupt storm occurs does not stop the interrupts.

The interrupt storm does seem to be related to the bus resets, but I
can't figure out yet how multiple devices per switch factor into the
issue.  Serializing all bus resets via a mutex doesn't seem to change
the behavior.

I'm still investigating, but if anyone knows how to get access to the
Broadcom datasheet or errata for this switch, please let me know.

Thanks,
Alex
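
For readers following along, the status-register check and the INTx
disable bit mentioned above can be sketched roughly as below.  This is
only an illustration, not the vfio-pci code: the helper names
(cfg_read16, intx_pending, intx_disabled) are hypothetical, and the
functions work on a plain byte snapshot of config space rather than on
a live device.  The register offsets and bit masks are the standard PCI
header values, matching the definitions in Linux's pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI configuration header offsets and bits. */
#define PCI_COMMAND              0x04    /* 16-bit command register */
#define PCI_COMMAND_INTX_DISABLE 0x0400  /* bit 10: INTx emulation disable */
#define PCI_STATUS               0x06    /* 16-bit status register */
#define PCI_STATUS_INTERRUPT     0x0008  /* bit 3: INTx interrupt status */

/* Hypothetical helper: read a little-endian 16-bit register from a
 * byte snapshot of config space (at least 8 bytes long). */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* True if the device reports itself as the source of an INTx interrupt.
 * A handler on a shared line would decline the interrupt when this is
 * false, which is the "deferring" case described above. */
static bool intx_pending(const uint8_t *cfg)
{
    return (cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_INTERRUPT) != 0;
}

/* True if the disable INTx bit is set in the command register. */
static bool intx_disabled(const uint8_t *cfg)
{
    return (cfg_read16(cfg, PCI_COMMAND) & PCI_COMMAND_INTX_DISABLE) != 0;
}
```

In the storm described above, the equivalent of intx_pending() stays
false for the GPU while the line keeps firing, and setting the bit
tested by intx_disabled() makes no difference, which is consistent with
the interrupt originating somewhere other than the GPU function itself.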