Hi Kevin,

On 12/9/21 4:21 AM, Tian, Kevin wrote:
>> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
>> Sent: Wednesday, December 8, 2021 8:56 PM
>>
>> On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
>>> Hi Baolu,
>>>
>>> On 12/8/21 3:44 AM, Lu Baolu wrote:
>>>> Hi Eric,
>>>>
>>>> On 12/7/21 6:22 PM, Eric Auger wrote:
>>>>> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>>>>>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>>>>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@xxxxxxx>
>>>>>>> Signed-off-by: Liu, Yi L<yi.l.liu@xxxxxxxxxxxxxxx>
>>>>>>> Signed-off-by: Ashok Raj<ashok.raj@xxxxxxxxx>
>>>>>>> Signed-off-by: Jacob Pan<jacob.jun.pan@xxxxxxxxxxxxxxx>
>>>>>>> Signed-off-by: Eric Auger<eric.auger@xxxxxxxxxx>
>>>>>> This Signed-off-by chain looks dubious, you are the author but the last
>>>>>> one in the chain?
>>>>> The 1st RFC in Aug 2018
>>>>> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
>>>>> said this was a generalization of Jacob's patch
>>>>>
>>>>> [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
>>>>>
>>>>> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
>>>>>
>>>>> So indeed Jacob should be the author. I guess the multiple rebases got
>>>>> this eventually replaced at some point, which is not an excuse. Please
>>>>> forgive me for that.
>>>>> Now the original patch already had this list of SoB so I don't know if I
>>>>> shall simplify it.
>>>> As we have decided to move the nested mode (dual stages) implementation
>>>> onto the developing iommufd framework, what's the value of adding this
>>>> into iommu core?
>>> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it is
>>> bound to be replaced by /dev/iommu fellow API.
>>> However until I can rebase on /dev/iommu code I am obliged to keep it to
>>> maintain this integration, hence the RFC.
>> Indeed, we are getting pretty close to having the base iommufd that we
>> can start adding stuff like this into. Maybe in January, you can look
>> at some parts of what is evolving here:
>>
>> https://github.com/jgunthorpe/linux/commits/iommufd
>> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
>> https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2
>>
>> From a progress perspective I would like to start with simple 'page
>> tables in userspace', ie no PASID in this step.
>>
>> 'page tables in userspace' means an iommufd ioctl to create an
>> iommu_domain where the IOMMU HW is directly traversing a
>> device-specific page table structure in user space memory. All the HW
>> today implements this by using another iommu_domain to allow the IOMMU
>> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> One clarification here in case people may get confused based on the
> current iommu_domain definition. Jason brainstormed with us on how
> to represent 'user page table' in the IOMMU sub-system. One is to
> extend iommu_domain as a general representation for any page table
> instance. The other option is to create new representations for user
> page tables and then link them under existing iommu_domain.
>
> This context is based on the 1st option, and as Jason said at the bottom
> we still need to sketch out whether it works as expected. 😊
>
>> This would come along with some ioctls to invalidate the IOTLB.
>>
>> I'm imagining this step as an iommu_group->op->create_user_domain()
>> driver callback which will create a new kind of domain with
>> domain-unique ops. Ie map/unmap related should all be NULL as those
>> are impossible operations.
>>
>> From there the usual struct device (ie RID) attach/detach stuff needs
>> to take care of routing DMAs to this iommu_domain.
> Usage-wise this covers the guest IOVA requirements i.e.
> when the guest kernel enables vIOMMU for kernel DMA-API mappings or for
> device assignment to guest userspace.
>
> For Intel this means optimization to the existing shadow-based vIOMMU
> implementation.
>
> For ARM this actually enables guest IOVA usage for the first time (correct
> me Eric?).

Yes, that's correct. This is the scope of this series (single PASID).

> IIRC SMMU doesn't support caching mode while write-protecting
> guest I/O page table was considered a no-go. So nesting is considered as
> the only option to support that.

That's correct too. No 'caching mode' is provisioned in the SMMU spec. I
thought it would just be a matter of adding 1 bit in an ID reg though.

Thanks

Eric
>
> and once 'user pasid table' is installed, this actually means guest SVA usage
> can also partially work for ARM if I/O page fault is not incurred.
>
>> Step two would be to add the ability for an iommufd using driver to
>> request that a RID&PASID is connected to an iommu_domain. This
>> connection can be requested for any kind of iommu_domain, kernel owned
>> or user owned.
>>
>> I don't quite have an answer how exactly the SMMUv3 vs Intel
>> difference in PASID routing should be resolved.
> For kernel owned the iommufd interface should be generic as the
> vendor difference is managed by the kernel itself.
>
> For user owned we'll need new uAPIs for user to specify PASID.
> As I replied in another thread only Intel currently requires it due to
> mdev. But other vendors could also do so when they decide to
> support mdev one day.
>
>> To get answers I'm hoping to start building some sketch RFCs for these
>> different things on iommufd, hopefully in January. I'm looking at user
>> page tables, PASID, dirty tracking and userspace IO fault handling as
>> the main features iommufd must tackle.
> Makes sense.
>
>> The purpose of the sketches would be to validate that the HW features
>> we want to expose can work well with the choices the base is making.
>>
>> Jason
> Thanks
> Kevin
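
For readers following the create_user_domain() idea quoted above, here is a rough user-space C sketch of what "a new kind of domain with domain-unique ops" could look like. It is only an illustration under stated assumptions: every name in it (sketch_domain, sketch_domain_ops, sketch_create_user_domain, sketch_map) is hypothetical and is not kernel or iommufd API. The only points taken from the thread are that map/unmap are left NULL for a user-owned page table and that IOTLB invalidation remains available.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the kernel's iommu_domain. */
struct sketch_domain;

struct sketch_domain_ops {
    int (*map)(struct sketch_domain *d, uint64_t iova, uint64_t paddr, size_t size);
    int (*unmap)(struct sketch_domain *d, uint64_t iova, size_t size);
    int (*invalidate_iotlb)(struct sketch_domain *d, uint64_t iova, size_t size);
};

struct sketch_domain {
    const struct sketch_domain_ops *ops;
    uint64_t user_pgtable;        /* user-space address of the device-specific table */
    unsigned long invalidations;  /* counts invalidation requests in this sketch */
};

/* The page table lives in user memory, so the kernel side only invalidates. */
static int user_invalidate_iotlb(struct sketch_domain *d, uint64_t iova, size_t size)
{
    (void)iova;
    (void)size;
    d->invalidations++;
    return 0;
}

/* Domain-unique ops: map/unmap stay NULL because they are impossible here. */
static const struct sketch_domain_ops user_domain_ops = {
    .map = NULL,
    .unmap = NULL,
    .invalidate_iotlb = user_invalidate_iotlb,
};

/* Hypothetical stand-in for the create_user_domain() callback in the thread. */
static void sketch_create_user_domain(struct sketch_domain *d, uint64_t user_pgtable)
{
    assert(d != NULL);
    d->ops = &user_domain_ops;
    d->user_pgtable = user_pgtable;
    d->invalidations = 0;
}

/* Generic caller: refuse to map on a domain that does not provide the op. */
static int sketch_map(struct sketch_domain *d, uint64_t iova, uint64_t paddr, size_t size)
{
    if (!d->ops->map)
        return -EOPNOTSUPP;
    return d->ops->map(d, iova, paddr, size);
}
```

With this shape, a generic caller that tries to map into a user-owned domain gets -EOPNOTSUPP instead of dereferencing a NULL op, while invalidation goes through the per-domain callback. Whether the real design ends up extending iommu_domain this way is exactly what the sketch RFCs mentioned above are meant to settle.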