On Mon, Sep 14, 2020 at 09:22:47AM -0700, Raj, Ashok wrote:
> Hi Jason,
>
> On Mon, Sep 14, 2020 at 10:47:38AM -0300, Jason Gunthorpe wrote:
> > On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:
> >
> > > > Jason suggest something like /dev/sva. There will be a lot of other
> > > > subsystems that could benefit from this (e.g. vDPA).
> > >
> > > Do you have a more precise idea of the interface /dev/sva would provide,
> > > how it would interact with VFIO and others? vDPA could transport the
> > > generic iommu.h structures via its own uAPI, and call the IOMMU API
> > > directly without going through an intermediate /dev/sva handle.
> >
> > Prior to PASID IOMMU really only makes sense as part of vfio-pci
> > because the iommu can only key on the BDF. That can't work unless the
> > whole PCI function can be assigned. It is hard to see how a shared PCI
> > device can work with IOMMU like this, so may as well use vfio.
> >
> > SVA and various vIOMMU models change this, a shared PCI driver can
> > absolutely work with a PASID that is assigned to a VM safely, and
> > actually don't need to know if their PASID maps a mm_struct or
> > something else.
>
> Well, IOMMU does care if it's a native mm_struct or something that belongs
> to guest. Because you need ability to forward page-requests and pickup
> page-responses from guest. Since there is just one PRQ on the IOMMU and
> responses can't be sent directly. You have to depend on vIOMMU type
> interface in guest to make all of this magic work right?

Yes, IOMMU cares, but not the PCI driver. It just knows it has a
PASID. Details of how page faults are handled or how the mapping is
set up are abstracted by the PASID.

> > This new PASID allocator would match the guest memory layout and
>
> Not sure what you mean by "match guest memory layout"?
> Probably, meaning first level is gVA or gIOVA?

It means whatever the qemu/viommu/guest/etc needs across all the
IOMMU/arch implementations.

Basically, there should only be two ways to get a PASID:
 - From an mm_struct that mirrors the creating process
 - Via '/dev/sva', which has a complete interface to create and control
   a PASID suitable for virtualization and more

VFIO should not have its own special way to get a PASID.

Jason