On Tue, 13 Aug 2024 13:43:41 -0300
Jason Gunthorpe <jgg@xxxxxxxx> wrote:

> On Mon, Aug 12, 2024 at 11:00:40AM -0600, Alex Williamson wrote:
> > These devices have an embedded interrupt controller which is programmed
> > with guest physical MSI address/data, which doesn't work. We need
> > vfio-pci kernel support to provide a device feature which disables
> > virtualization of the MSI capability registers. Then we can do brute
> > force testing for writes matching the MSI address, from which we can
> > infer writes of the MSI data, replacing each with host physical values.
> >
> > This has only been tested on ath11k (0x1103), ath12k support is
> > speculative and requires testing. Note that Windows guest drivers make
> > use of multi-vector MSI which requires interrupt remapping support in
> > the host.
>
> The way it is really supposed to work, is that the guest itself
> controls/knows the MSI addr/data pairs and the interrupt remapping HW
> makes that delegation safe since all the interrupt processing will be
> qualified by the RID.
>
> Then the guest can make up the unique interrupts for MSI and any
> internal "IMS" sources and we just let the guest directly write the
> MSI/MSI-X and any IMS values however it wants.
>
> This hackery to capture and substitute the IMS programming is neat and
> will solve this one device, but there are more IMS style devices in
> the pipeline than will really need a full solution.

How does the guest know to write a remappable vector format? How does
the guest know the host interrupt architecture? For example why would
an aarch64 guest program an MSI vector of 0xfee... if the host is x86?

The idea of guest owning the physical MSI address space sounds great,
but is it practical? Is it something that would be accomplished while
this device is still relevant?

> > + * The Windows driver makes use of multi-vector MSI, where our sanity test
> > + * of the MSI data value must then mask off the vector offset for comparison
> > + * and add it back to the host base data value on write.
>
> But is that really enough? If the vector offset is newly created then
> that means the VM built a new interrupt that needs setup to be routed
> into the VM?? Is that why you say it "requires interrupt remapping
> support" because that setup is happening implicitly on x86?
>
> It looks like Windows is acting as I said Linux should, with a
> "irq_chip" and so on to get the unique interrupt source a proper
> unique addr/data pair...

The Windows driver is just programming the MSI capability to use 16
vectors. We configure those vectors on the host at the time the
capability is written. Whereas the Linux driver is only using a single
vector and therefore writing the same MSI address and data at the
locations noted in the trace, the Windows driver is writing different
data values at different locations to make use of those vectors. This
note is simply describing that we can't directly write the physical
data value into the device, we need to determine which vector offset
the guest is using and provide the same offset from the host data
register value.

I don't know that interrupt remapping is specifically required, but the
MSI domain needs to support MSI_FLAG_MULTI_PCI_MSI and AFAIK that's
only available with interrupt remapping on x86, ie.
pci_alloc_irq_vectors() with max_vecs >1 and PCI_IRQ_MSI flags needs to
work on the host to mirror the guest MSI configuration. Thanks,

Alex
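
For reference, the vector offset handling described above can be sketched
roughly as follows. This is only an illustration, not the code from the
patch: the helper name msi_translate_data() and its parameters are
hypothetical, and it assumes the vector count is a power of two and that
both base data values leave the low log2(nr_vectors) bits free for the
vector offset, as multi-vector MSI normally arranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * guest_base: MSI data value the guest programmed in the MSI capability
 * host_base:  MSI data value the host programmed for the same capability
 * nr_vectors: number of enabled vectors (power of two, 1..32)
 * written:    16-bit value the guest is writing into the device
 *
 * Returns true and stores the host value in *out when the write looks
 * like guest MSI data, false when it should be passed through untouched.
 */
static bool msi_translate_data(uint16_t guest_base, uint16_t host_base,
                               unsigned int nr_vectors, uint16_t written,
                               uint16_t *out)
{
    uint16_t mask;

    assert(out != NULL);

    /* Reject vector counts that multi-vector MSI cannot express. */
    if (nr_vectors == 0 || nr_vectors > 32 ||
        (nr_vectors & (nr_vectors - 1)) != 0)
        return false;

    /* Low bits select the vector, e.g. 0xf for 16 vectors. */
    mask = (uint16_t)(nr_vectors - 1);

    /* Sanity test: compare with the vector offset masked off. */
    if ((written & ~mask) != (guest_base & ~mask))
        return false;

    /* Carry the guest's vector offset over to the host base value. */
    *out = (uint16_t)((host_base & ~mask) | (written & mask));
    return true;
}
```

With a single vector the mask is zero, so this reduces to the exact match
and straight substitution used for the Linux driver case.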