On Sun, May 12, 2024 at 11:03:53AM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 12, 2024 at 08:47:01PM -0700, Nicolin Chen wrote:
> > Add a new iommufd_viommu core structure to represent a vIOMMU instance in
> > the user space, typically backed by a HW-accelerated feature of an IOMMU,
> > e.g. NVIDIA CMDQ-Virtualization (an ARM SMMUv3 extension) and AMD Hardware
> > Accelerated Virtualized IOMMU (vIOMMU).
>
> I expect this will also be the only way to pass in an associated KVM;
> userspace would supply the kvm when creating the viommu.
>
> The tricky bit of this flow is how to manage the S2. It is necessary
> that the S2 be linked to the viommu:
>
> 1) ARM BTM requires the VMID to be shared with KVM
> 2) AMD and others need the S2 translation because some of the HW
>    acceleration is done inside the guest address space
>
> I haven't looked closely at AMD but presumably the VIOMMU create will
> have to install the S2 into a DID or something?
>
> So we need the S2 to exist before the VIOMMU is created, but the
> drivers are going to need some more fixing before that will fully
> work.
>
> Does the nesting domain create need the viommu as well (in place of
> the S2 hwpt)? That feels sort of natural.

Yes, I had a similar thought initially: each viommu is backed by a
nested IOMMU HW, and a special HW accelerator like VCMDQ could be
treated as an extension on top of that. It might not be as
straightforward as the current design having vintf<->viommu and
vcmdq<->vqueue pairings, though...

In that case, we could then support viommu_cache_invalidate, which is
quite natural for SMMUv3. Yet, I recall Kevin said that VT-d doesn't
want or need that.

> There is still a lot of fixing before everything can work fully, but
> do we need to make some preparations here in the uapi? Like starting
> to thread the S2 through it as I described?
>
> Kevin, does Intel foresee any viommu needs on current/future Intel HW?
> I assume you are thinking about invalidation queue bypass like
> everyone else.

I think it is an essential feature for vSVA.

> > A driver should embed this core structure in its driver viommu structure
> > and call the new iommufd_viommu_alloc() helper to allocate a core/driver
> > structure bundle and fill its core viommu->ops:
> >
> >     struct my_driver_viommu {
> >         struct iommufd_viommu core;
> >         ....
> >     };
> >
> >     static const struct iommufd_viommu_ops my_driver_viommu_ops = {
> >         .free = my_driver_viommu_free,
> >     };
> >
> >     struct my_driver_viommu *my_viommu =
> >         iommufd_viommu_alloc(my_driver_viommu, core);
>
> Why don't we have an ictx here anyhow? The caller has it? Just pass it
> down and then it is normal:
>
>     my_viommu = iommufd_object_alloc_elm(ictx, my_viommu,
>                                          IOMMUFD_OBJ_HWPT_VIOMMU,
>                                          core.obj);

Oh, in that case, we probably don't need a level-3 obj allocator, which
was previously missing an ictx to allocate an obj->id.

Thanks
Nicolin