On 2017/10/21 0:36, Alex Williamson wrote:
> On Fri, 20 Oct 2017 13:04:43 +0800
> Kenneth Lee <liguozhu@xxxxxxxxxxxxx> wrote:
>
>> On Thu, Oct 19, 2017 at 12:56:04PM -0600, Alex Williamson wrote:
>>> Date: Thu, 19 Oct 2017 12:56:04 -0600
>>> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
>>> To: Kenneth Lee <liguozhu@xxxxxxxxxxxxx>
>>> CC: Jon Masters <jcm@xxxxxxxxxxxxxx>, Jon Masters <jcm@xxxxxxxxxx>,
>>>  Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>, liubo95@xxxxxxxxxx,
>>>  xuzaibo@xxxxxxxxxx
>>> Subject: Re: To extend the feature of vfio-mdev
>>> Message-ID: <20171019125604.26577eda@xxxxxxxxxx>
>>>
>>> Hi Kenneth,
>>>
>>> On Thu, 19 Oct 2017 12:13:46 +0800
>>> Kenneth Lee <liguozhu@xxxxxxxxxxxxx> wrote:
>>>
>>>> Dear Alex,
>>>>
>>>> I hope this mail finds you well. I would like to discuss the possibility of extending the vfio-mdev feature into a general accelerator framework for Linux. I call the framework "WrapDrive".
>>>>
>>>> I gave a presentation at Linaro Connect SFO17 (ref: http://connect.linaro.org/resource/sfo17/sfo17-317/) and discussed it with Jon Masters. He said he can connect us for further cooperation.
>>>>
>>>> The idea of WrapDrive is to create an mdev for every user application so that they can all share the same PF or VF facility. This is important for accelerators, because in most cases we cannot create a VF for every process.
>>>>
>>>> WrapDrive needs to add the following features on top of vfio and vfio-mdev:
>>>>
>>>> 1. Define a unified ABI in sysfs so that the same type of accelerator/algorithm can be managed from user space.
>>>
>>> We already have a defined, standard mdev interface where vendor drivers can add additional attributes. If warpdrive is a wrapper around vfio-mdev, can't it define standard attributes w/o vfio changes?
>>
>> Yes. We just define the necessary attributes so that applications with the same requirements can treat them as a whole.
>>
>>>> 2. Let the mdev use the parent dev's iommu facility.
>>>
>>> What prevents you from doing this now? The mdev vendor driver is entirely responsible for managing the DMA of each mdev device. Mdev vGPUs use the GTT of the parent device to do this today, vfio only tracks user mappings and provides pinned pages to the vendor driver on request. IOW, this sounds like something within the scope of the vendor driver, not the vfio-mdev core.
>>
>> I'm sorry, I don't know much about how i915 works. But according to the implementation of vfio_iommu_type1_attach_group(), an mdev's iommu_group is added to the external_domain list, while vfio_iommu_map() calls iommu_map() only for the domain list.
>>
>> Therefore, an ioctl(VFIO_IOMMU_MAP_DMA) against a container holding only the mdev's iommu_group won't map anything. What is the mdev vendor driver expected to do? Should it register with the notification chain, or adopt another interface? Is this the intent of the mdev framework? I think it may be necessary to provide some standard way by default.
>
> This is the "mediation" of a mediated driver: it needs to be aware of any DMA that the device might perform within the user address space and request pinning of those pages through the mdev interface. Additionally, when an IOMMU is active on the host, it's the mdev vendor driver's responsibility to set up any necessary IOMMU mappings for the mdev. The mdev device works within the IOMMU context of the parent device. There is no magic "map everything" option with mdev as there is for IOMMU isolated devices. Part of the idea of mdev is that isolation can be provided by device specific means, such as GTTs for vGPUs. We currently have only an invalidation notifier, such that vendor drivers can invalidate pinned mappings when unmapped by the user; the mapping path presumes that device mediation explicitly requests page pinning based on device configuration.
>
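I see. To make sure I understand the expected flow: if I read the current (~4.14) mdev interface correctly, a vendor driver would do roughly the following in its mediation path. This is only a sketch; the my_* names and the per-mdev state are invented for illustration, so please correct me if I have the interface wrong:

#include <linux/iommu.h>
#include <linux/kernel.h>
#include <linux/mdev.h>
#include <linux/mm.h>
#include <linux/notifier.h>
#include <linux/vfio.h>

/* Invented per-mdev state, for illustration only. */
struct my_mdev_state {
	struct mdev_device *mdev;
	struct notifier_block iommu_nb;
};

/* Invalidate whatever we pinned in the range the user just unmapped. */
static int my_iommu_notifier(struct notifier_block *nb, unsigned long action,
			     void *data)
{
	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
		struct vfio_iommu_type1_dma_unmap *unmap = data;

		pr_debug("unmap iova 0x%llx size 0x%llx\n",
			 unmap->iova, unmap->size);
		/* Stop device DMA into [unmap->iova, unmap->iova + unmap->size)
		 * and vfio_unpin_pages() the pfns we pinned in that range. */
	}
	return NOTIFY_OK;
}

/* Register for DMA_UNMAP notifications when the mdev is opened. */
static int my_mdev_open(struct my_mdev_state *s)
{
	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;

	s->iommu_nb.notifier_call = my_iommu_notifier;
	return vfio_register_notifier(mdev_dev(s->mdev), VFIO_IOMMU_NOTIFY,
				      &events, &s->iommu_nb);
}

/*
 * Called when the mediation code decodes a device command targeting user
 * IOVA 'iova': pin the backing page and get a host pfn that can then be
 * programmed into the parent device (or its GTT, in the vGPU case).
 */
static int my_mdev_pin_one(struct my_mdev_state *s, unsigned long iova,
			   unsigned long *host_pfn)
{
	unsigned long user_pfn = iova >> PAGE_SHIFT;

	return vfio_pin_pages(mdev_dev(s->mdev), &user_pfn, 1,
			      IOMMU_READ | IOMMU_WRITE, host_pfn);
}

If that is the intended model, then every accelerator vendor driver has to re-implement this pin-on-demand logic, which is what we were hoping to avoid with a standard default.
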
>>>> 3. Let the iommu driver accept more than one iommu_domain for the same device. The substream id or pasid should be supported for that.
>>>
>>> You're really extending the definition of an iommu_domain to include PASID to do this, I don't think it makes sense in the general case. So perhaps you're talking about a PASID management layer sitting on top of an iommu_domain. AIUI for PCIe, a device has a requester ID which is used to find the context entry for that device. The IOMMU may support PASID, which would cause a first level lookup via those set of page tables, or it might only support second level translation. The iommu_domain is a reflection of that initial, single requester ID.
>>
>> Maybe I misunderstand this. But IOMMU hardware, such as the SMMU on ARM, supports multiple page tables, each referred to by something like an ASID. If we are to support that in Linux, iommu_domain seems the best choice (whether you call it a cookie, an id or something else). Otherwise, where would you get an object to refer to it?
>
> For PASID, a PASID is unique only within the requester ID. I don't know of anything equivalent to your ASID within PCIe.
>
>>>> 4. Support SVM in vfio and iommu.
>>>
>>> There are numerous discussions about this ongoing.
>>
>> Yes. I just said we needed the support.
>
> It seems like this is the crux of your design if you're looking for IOMMU based isolation based on PASID with dynamic mapping of the process address space. There was quite a lot of discussion about this at the PCI/IOMMU/VFIO uconf at LPC this year and the details of the IOMMU API interfaces are currently being developed. This has implications for both directly assigned vfio devices as well as the potential to further enhance vfio-mdev such that DMA isolation and mapping might be managed in a common way while the vendor driver manages only the partitioning of the device.
>

Yes. When vfio-mdev was first introduced, PASID was not widely available, so we had to use a rather complex vendor driver to present multiple virtual devices on top of a single physical device. Things have changed now that PASID is becoming common: with a PASID, the (device, pasid) pair can be treated much like a normal physical device that gets the full IOMMU. In theory we could use vfio-pci or vfio-platform for such a device, but vfio-mdev provides a dynamic create interface, which is very useful and reusable. Enhancing vfio-mdev so that DMA isolation and mapping are managed in a common way would therefore be very important.

--
Thanks,
Liubo
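
P.S. For concreteness, this is the plain vfio type1 flow I would like WrapDrive processes to keep using unchanged; it is only a rough userspace sketch, and the group number, IOVA and buffer size below are made up. Today, when the container holds only an mdev group, VFIO_IOMMU_MAP_DMA only records the mapping so the vendor driver can request pinning later; with PASID-backed mdevs the same call could populate the IOMMU directly, which is the common handling discussed above.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

int main(void)
{
	int container = open("/dev/vfio/vfio", O_RDWR);
	int group = open("/dev/vfio/26", O_RDWR);	/* group number is made up */
	struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map) };
	void *buf;

	if (container < 0 || group < 0)
		return 1;

	/* Attach the group to the container, then pick the type1 IOMMU backend. */
	ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
	ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

	/* A 1 MiB anonymous buffer the device should be allowed to DMA into. */
	buf = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return 1;

	map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map.vaddr = (unsigned long)buf;
	map.iova = 0;			/* device-visible address, made up */
	map.size = 1 << 20;

	if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
		perror("VFIO_IOMMU_MAP_DMA");

	return 0;
}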