On Wed, Nov 22, 2017 at 03:44:55PM +1100, David Gibson wrote: > On Tue, Nov 21, 2017 at 09:28:46PM -0700, Alex Williamson wrote: > > On Wed, 22 Nov 2017 15:09:32 +1100 > > Alexey Kardashevskiy <aik@xxxxxxxxx> wrote: > > > > > By default VFIO disables mapping of MSIX BAR to the userspace as > > > the userspace may program it in a way allowing spurious interrupts; > > > instead the userspace uses the VFIO_DEVICE_SET_IRQS ioctl. > > > > > > This works fine as long as the system page size equals to the MSIX > > > alignment requirement which is 4KB. However with a bigger page size > > > the existing code prohibits mapping non-MSIX parts of a page with MSIX > > > structures so these parts have to be emulated via slow reads/writes on > > > a VFIO device fd. If these emulated bits are accessed often, this has > > > serious impact on performance. > > > > > > This adds an ioctl to the vfio-pci device which hides the sparse > > > capability and allows the userspace to map a BAR with MSIX structures. > > > > So the user is in control of telling the kernel whether they're allowed > > to mmap the msi-x vector table. That makes absolutely no sense. If > > you're trying to figure out how userspace knows whether to implicitly > > avoid mmap'ing the msix region, I think there are far better ways in > > the existing region info ioctl. We could use a flag, or maybe the > > existence of a capability chain pointer, or a new capability. But > > absolutely not this. The kernel needs to decide whether it's going to > > let the user do this, not the user. Thanks, > > No, it doesn't. This is actually the approach we discussed in Prague. > > Remember that intercepting access to the MSI-X table is not a host > safety / security issue. It's just that without that we won't wire up How is that not a security issue? Having an guest or an user-space access to muck with the MSI-X vectors allows them to do some form of memory writes using the IOAPIC. > the device's MSI-X vectors properly so they won't work. > > Basically the decision here is between > > A) Allow MSI-X configuration via standard PCI mechanisms, at the > cost of making access slow for any registers sharing a page with > the MSI-X table. > > or > > B) Make access to BAR registers sharing a page with the MSI-X table > fast, at the cost of requiring some alternative mechanism to > configure MSI-X vectors. > > And that is a tradeoff that it is reasonable for userspace to make. > > In the case of KVM guests, the decision depends entirely on the > *guest* platform. Usually we need (A) because the guest expects to be > able to poke the MSI-X table in the usual way. However for PAPR > guests, there's an alternative mechanism via an RTAS call, which means > we can use (B). > > The host kernel can't make this decision, because it doesn't know the > guest platform (well, KVM might, but VFIO doesn't). > > A userspace VFIO program could also elect for (B) if it does care > about performance of access to registers in the same BAR as the MSI-X > table, but doesn't need MSI-X for example. > > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson