On Fri, 2013-09-13 at 18:49 +1000, Alexey Kardashevskiy wrote:
> On 09/13/2013 07:23 AM, Alex Williamson wrote:
> > So far we've succeeded at making KVM and VFIO mostly unaware of each
> > other, but there's an important point where that breaks down. Intel
> > VT-d hardware may or may not support snoop control. When snoop
> > control is available, intel-iommu promotes No-Snoop transactions on
> > PCIe to be cache coherent. That allows KVM to handle things like the
> > x86 WBINVD opcode as a nop. When the hardware does not support this,
> > KVM must implement a hardware-visible WBINVD for the guest.
> >
> > We could simply let userspace tell KVM how to handle WBINVD, but it's
> > privileged for a reason. Allowing an arbitrary user to enable
> > physical WBINVD gives them more access to the hardware. Previously,
> > this has only been enabled for guests supporting legacy PCI device
> > assignment. In such cases it's necessary for proper guest execution.
> > We therefore create a new KVM-VFIO virtual device. The user can add
> > and remove VFIO groups to this device via file descriptors. KVM
> > makes use of the VFIO external user interface to validate that the
> > user has access to physical hardware and gets the coherency state of
> > the IOMMU from VFIO. This provides equivalent functionality to
> > legacy KVM assignment, while keeping (nearly) all the bits isolated.
> >
> > The one intrusion is the resulting flag indicating the coherency
> > state. For this RFC it's placed on the x86 kvm_arch struct, however
> > I know POWER has interest in using the VFIO external user interface,
> > and I'm hoping we can share a common KVM-VFIO device. Perhaps they
> > care about No-Snoop handling as well or the code can be #ifdef'd.
>
>
> POWER does not support (at least book3s - "server", not sure about others)
> this cache-non-coherent stuff at all.

Then it's easy for your IOMMU API interface to always return cache
coherent or never cache coherent or whatever ;)

> Regarding reusing this device with external API for POWER - I posted a
> patch which introduces a KVM device to link KVM with IOMMU but besides the
> list of groups registered in KVM, it also provides a way to find a group
> by LIOBN (logical bus number) which is used in DMA map/unmap hypercalls. So
> in my case kvm_vfio_group struct needs LIOBN and it would be nice to have
> window_size there too (for a quick boundary check). I am not sure we want
> to mix everything here.
>
> It is in "[PATCH v10 12/13] KVM: PPC: Add support for IOMMU in-kernel
> handling" if you are interested (kvmppc_spapr_tce_iommu_device).

Yes, I stole the code to get the vfio symbols from your code. The
convergence I was hoping to achieve is that KVM doesn't really want to
know about VFIO and vice versa. We can therefore at least limit the
intrusion by sharing a common device. Obviously for you it will need
some extra interfaces to associate an LIOBN with a group, but we keep both
the kernel and userspace cleaner by avoiding duplication where we can.
Is this really not extensible to your usage? Thanks,

Alex