RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Thu, 9 Dec 2021 12:31:05 +0000

> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Thursday, December 9, 2021 4:37 PM
> 
> On Thu, Dec 09 2021 at 05:23, Kevin Tian wrote:
> >> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >> I don't see anything wrong with that. A subdevice is it's own entity and
> >> VFIO can chose the most conveniant representation of it to the guest
> >> obviously.
> >>
> >> How that is backed on the host does not really matter. You can expose
> >> MSI-X to the guest with a INTx backing as well.
> >>
> >
> > Agree with this point. How the interrupts are represented to the guest
> > is orthogonal to how the backend resource is allocated. Physically MSI-X
> > and IMS can be enabled simultaneously on an IDXD device. Once
> > dynamic allocation is allowed for both, either one can be allocated for
> > a subdevice (with only difference on supported #subdevices).
> >
> > When an interrupt resource is exposed to the guest with the same type
> > (e.g. MSI-on-MSI or IMS-on-IMS), it can be also passed through to the
> > guest as long as a hypercall machinery is in place to get addr/data pair
> > from the host (as you suggested earlier).
> 
> As I pointed out in the conclusion of this thread, IMS is only going to
> be supported with interrupt remapping in place on both host and guest.

I still need to read the last few mails but thanks for pointing it out now.

> 
> As these devices are requiring a vIOMMU on the guest anyway (PASID, User
> IO page tables), the required hypercalls are part of the vIOMMU/IR
> implementation. If you look at it from the irqdomain hierarchy view:
> 
>                          |- PCI-MSI
>   VECTOR -- [v]IOMMU/IR -|- PCI-MSI-X
>                          |- PCI-IMS
> 
> So host and guest use just the same representation which makes a ton of
> sense.
> 
> There are two places where this matters:
> 
>   1) The activate() callback of the IR domain
> 
>   2) The irq_set_affinity() callback of the irqchip associated with the
>      IR domain
> 
> Both callbacks are allowed to fail and the error code is handed back to
> the originating call site.
> 
> If you look at the above hierarchy view then MSI/MSI-X/IMS are all
> treated in exactly the same way. It all becomes the common case.
> 
> No?
> 

Yes, I think above makes sense. 

For a new guest OS which supports this enlightened hierarchy the same
machinery works for all type of interrupt storages and we have a
failure path from host to guest in case of host-side resource shortage.
And no trap is required on guest access to the interrupt storage.

A legacy guest OS which doesn't support the enlightened hierarchy
can only use MSI/MSI-X which is still trapped. But with vector 
reallocation support from your work the situation already improves 
a lot than current awkward way in VFIO (free all previous vectors 
and then re-allocate).

Overall I think this is a good modeling.

Thanks
Kevin