> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Thursday, December 9, 2021 4:37 PM
>
> On Thu, Dec 09 2021 at 05:23, Kevin Tian wrote:
> >> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >> I don't see anything wrong with that. A subdevice is its own entity and
> >> VFIO can choose the most convenient representation of it to the guest
> >> obviously.
> >>
> >> How that is backed on the host does not really matter. You can expose
> >> MSI-X to the guest with an INTx backing as well.
> >>
> >
> > Agree with this point. How the interrupts are represented to the guest
> > is orthogonal to how the backend resource is allocated. Physically MSI-X
> > and IMS can be enabled simultaneously on an IDXD device. Once
> > dynamic allocation is allowed for both, either one can be allocated for
> > a subdevice (with only difference on supported #subdevices).
> >
> > When an interrupt resource is exposed to the guest with the same type
> > (e.g. MSI-on-MSI or IMS-on-IMS), it can be also passed through to the
> > guest as long as a hypercall machinery is in place to get addr/data pair
> > from the host (as you suggested earlier).
>
> As I pointed out in the conclusion of this thread, IMS is only going to
> be supported with interrupt remapping in place on both host and guest.

I still need to read the last few mails, but thanks for pointing it out now.

>
> As these devices are requiring a vIOMMU on the guest anyway (PASID, User
> IO page tables), the required hypercalls are part of the vIOMMU/IR
> implementation. If you look at it from the irqdomain hierarchy view:
>
>                          |- PCI-MSI
>   VECTOR -- [v]IOMMU/IR -|- PCI-MSI-X
>                          |- PCI-IMS
>
> So host and guest use just the same representation which makes a ton of
> sense.
>
> There are two places where this matters:
>
>   1) The activate() callback of the IR domain
>
>   2) The irq_set_affinity() callback of the irqchip associated with the
>      IR domain
>
> Both callbacks are allowed to fail and the error code is handed back to
> the originating call site.
>
> If you look at the above hierarchy view then MSI/MSI-X/IMS are all
> treated in exactly the same way. It all becomes the common case.
>
> No?
>

Yes, I think the above makes sense.

For a new guest OS which supports this enlightened hierarchy, the same
machinery works for all types of interrupt storage, and we have a
failure path from host to guest in case of a host-side resource shortage.
And no trap is required on guest access to the interrupt storage.

A legacy guest OS which doesn't support the enlightened hierarchy
can only use MSI/MSI-X, which is still trapped. But with the vector
reallocation support from your work, the situation is already much
better than the current awkward approach in VFIO (free all previous
vectors and then re-allocate).

Overall I think this is a good model.

Thanks
Kevin
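
To make the failure path discussed above concrete, here is a small user-space toy model of the VECTOR -- IR -- {MSI, MSI-X, IMS} hierarchy. It is only a sketch and not kernel code: struct toy_domain, toy_request(), toy_retarget(), the choice of -ENOSPC for a host-side shortage, and the fixed count of 4 CPUs are all made up for illustration. The only thing it tries to show is that the two fallible callbacks sit at the IR level, so every leaf type sees the same error handed back to the call site.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for one hierarchy level; NOT the kernel's struct irq_domain. */
struct toy_domain {
	const char *name;
	struct toy_domain *parent;
	/* Both callbacks may fail: 0 on success, negative errno otherwise. */
	int (*activate)(struct toy_domain *d);
	int (*set_affinity)(struct toy_domain *d, unsigned int cpu);
	int free_entries;	/* IR level only: remaining remap entries */
	unsigned int nr_cpus;	/* IR level only: valid targets are 0..nr_cpus-1 */
};

/* IR level. In a guest, the vIOMMU/IR hypercall would sit here (assumption). */
static int ir_activate(struct toy_domain *d)
{
	if (d->free_entries <= 0)
		return -ENOSPC;	/* models a host-side resource shortage */
	d->free_entries--;
	return 0;
}

static int ir_set_affinity(struct toy_domain *d, unsigned int cpu)
{
	return cpu < d->nr_cpus ? 0 : -EINVAL;
}

/* Activate parent levels first, then this level; the first error wins. */
static int toy_activate(struct toy_domain *d)
{
	int ret;

	assert(d != NULL);
	if (d->parent) {
		ret = toy_activate(d->parent);
		if (ret)
			return ret;
	}
	return d->activate ? d->activate(d) : 0;
}

/* Same walk for affinity changes: a parent failure reaches the caller. */
static int toy_set_affinity(struct toy_domain *d, unsigned int cpu)
{
	int ret;

	assert(d != NULL);
	if (d->parent) {
		ret = toy_set_affinity(d->parent, cpu);
		if (ret)
			return ret;
	}
	return d->set_affinity ? d->set_affinity(d, cpu) : 0;
}

/* Build VECTOR -- IR -- leaf and activate the leaf; leaf_name is only a label. */
int toy_request(const char *leaf_name, int ir_free_entries)
{
	struct toy_domain vector = { .name = "VECTOR" };
	struct toy_domain ir = { .name = "IR", .parent = &vector,
				 .activate = ir_activate,
				 .set_affinity = ir_set_affinity,
				 .free_entries = ir_free_entries, .nr_cpus = 4 };
	struct toy_domain leaf = { .name = leaf_name, .parent = &ir };

	return toy_activate(&leaf);
}

/* Build the same hierarchy and retarget the leaf to the given CPU. */
int toy_retarget(const char *leaf_name, unsigned int cpu)
{
	struct toy_domain vector = { .name = "VECTOR" };
	struct toy_domain ir = { .name = "IR", .parent = &vector,
				 .activate = ir_activate,
				 .set_affinity = ir_set_affinity,
				 .free_entries = 1, .nr_cpus = 4 };
	struct toy_domain leaf = { .name = leaf_name, .parent = &ir };

	return toy_set_affinity(&leaf, cpu);
}
```

In this model, toy_request("PCI-IMS", 0) and toy_request("PCI-MSI", 0) fail identically because the leaf has no fallible callback of its own, which is the "it all becomes the common case" point in miniature.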