On Fri, Apr 21, 2023 at 02:58:01PM -0300, Jason Gunthorpe wrote:

> > which for practical purposes in this context means an ITS.
>
> I haven't delved into it super detail, but.. my impression was..
>
> The ITS page only becomes relevant to the IOMMU layer if the actual
> IRQ driver calls iommu_dma_prepare_msi()

Nicolin and I sat down and traced this through, this explanation is
almost right...

irq-gic-v4.c is some sub module of irq-gic-v3-its.c so it does end up
calling iommu_dma_prepare_msi(), however..

qemu will set up the ACPI so that the VM thinks the ITS page is at
0x08080000. I think it maps some dummy CPU memory to this address.

iommufd will map the real ITS page at MSI_IOVA_BASE = 0x8000000 (!!)
and only into the IOMMU.

qemu will set up some RMRR thing to make 0x8000000 1:1 at the VM's
IOMMU.

When the DMA API is used, iommu_dma_prepare_msi() is called, which will
select an MSI page address that avoids the reserved region, so it is
some random value != 0x8000000, and maps the dummy CPU page to it.

The VM will then do an MSI-X programming cycle with the S1 IOVA of the
CPU page and the data. qemu traps this and throws away the address
from the VM. The kernel sets up the interrupt and assumes 0x8000000
is the right IOVA.

When VFIO is used, iommufd in the VM will force the MSI window to
0x8000000, and instead of putting a 1:1 mapping there we map the dummy
CPU page, and then everything is broken. Adding the reserved check is
an improvement.

The only way to properly fix this is to have qemu stop throwing away
the address during the MSI-X programming. This needs to be programmed
into the device instead.

I have no idea how best to get there with the ARM GIC setup.. It feels
really hard.

Jason