On Mon, 14 Sep 2020 13:33:54 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Mon, Sep 14, 2020 at 09:22:47AM -0700, Raj, Ashok wrote:
> > Hi Jason,
> >
> > On Mon, Sep 14, 2020 at 10:47:38AM -0300, Jason Gunthorpe wrote:
> > > On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:
> > >
> > > > > Jason suggest something like /dev/sva. There will be a lot of other
> > > > > subsystems that could benefit from this (e.g. vDPA).
> > > >
> > > > Do you have a more precise idea of the interface /dev/sva would provide,
> > > > how it would interact with VFIO and others? vDPA could transport the
> > > > generic iommu.h structures via its own uAPI, and call the IOMMU API
> > > > directly without going through an intermediate /dev/sva handle.
> > >
> > > Prior to PASID IOMMU really only makes sense as part of vfio-pci
> > > because the iommu can only key on the BDF. That can't work unless the
> > > whole PCI function can be assigned. It is hard to see how a shared PCI
> > > device can work with IOMMU like this, so may as well use vfio.
> > >
> > > SVA and various vIOMMU models change this, a shared PCI driver can
> > > absolutely work with a PASID that is assigned to a VM safely, and
> > > actually don't need to know if their PASID maps a mm_struct or
> > > something else.
> >
> > Well, IOMMU does care if it's a native mm_struct or something that belongs
> > to guest. Because you need ability to forward page-requests and pickup
> > page-responses from guest. Since there is just one PRQ on the IOMMU and
> > responses can't be sent directly. You have to depend on vIOMMU type
> > interface in guest to make all of this magic work right?
>
> Yes, IOMMU cares, but not the PCI Driver. It just knows it has a
> PASID. Details on how page faulting is handled or how the mapping is
> setup is abstracted by the PASID.
>
> > > This new PASID allocator would match the guest memory layout and
> >
> > Not sure what you mean by "match guest memory layout"?
> > Probably, meaning first level is gVA or gIOVA?
>
> It means whatever the qemu/viommu/guest/etc needs across all the
> IOMMU/arch implementations.
>
> Basically, there should only be two ways to get a PASID
>  - From mm_struct that mirrors the creating process
>  - Via '/dev/sva' which has a complete interface to create and
>    control a PASID suitable for virtualization and more
>
> VFIO should not have its own special way to get a PASID.

"its own special way" is arguable; VFIO is just making use of what's
being proposed as the uapi via its existing IOMMU interface. PASIDs
are also a system resource, so we require some degree of access control
and quotas for management of PASIDs. Does libvirt now get involved to
know whether an assigned device requires PASIDs such that access to
this dev file is provided to QEMU? How does the kernel validate usage
or implement quotas when disconnected from device ownership? PASIDs
would be an obvious DoS path if any user can create arbitrary
allocations.

If we can move code out of VFIO, I'm all for it, but I think it needs
to be better defined than "implement magic universal sva uapi
interface" before we can really consider it. Thanks,

Alex