On Thu, May 23, 2024 at 01:43:45AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, May 22, 2024 9:39 PM
> >
> > VIOMMU contains:
> >  - A nesting parent
> >  - A KVM
> >  - Any global per-VM data the driver needs
> >    * In ARM case this is VMID, sometimes shared with KVM
>
> In which case is it not shared with KVM? I had the impression that
> VMID always comes from KVM in this VCMDQ usage. 😊

Not actually. I guess that shared VMID is for BTM.

> > On ARM the S2 is not divorced from the VIOMMU, ARM requires a single
> > VMID, shared with KVM, and localized to a single VM for some of the
> > bypass features (vBTM, vCMDQ). So to attach a S2 you actually have to
> > attach the VIOMMU to pick up the correct VMID.
> >
> > I imagine something like this:
> >  hwpt_alloc(deva, nesting_parent=true) = shared_s2
> >  viommu_alloc(deva, shared_s2) = viommu1
> >  viommu_alloc(devb, shared_s2) = viommu2
> >  hwpt_alloc(deva, viommu1, vste) = deva_vste
> >  hwpt_alloc(devb, viommu2, vste) = devb_vste
> >  attach(deva, deva_vste)
> >  attach(devb, devb_vste)
> >  attach(devc, shared_s2)
>
> I wonder whether we want to make viommu as the 1st-class citizen
> for any nested hwpt if it is desirable to enable it even for VT-d which
> lacks of a hw viommu concept at the moment.

I think Jason is simply using SMMU as an example here.

Also FWIW, I am trying a core-allocated core-managed viommu for
IOMMU_VIOMMU_TYPE_DEFAULT. So the VT-d driver doesn't need to hold
a viommu, while the VMM could still allocate one if it wants. And
the VIOMMU interface can provide some helpers if the driver wants
some info from the core-managed viommu: a virtual dev ID to
physical dev ID (returning a device pointer) translation, for
example. And we can add more after we brainstorm.

Sample change:

@@ -623,6 +625,18 @@ struct iommu_ops {
+ * @viommu_alloc: Allocate an iommufd_viommu associating to a nested parent
+ *                @domain as a user space IOMMU instance for HW-accelerated
+ *                features from the physical IOMMU behind the @dev. The
+ *                @viommu_type must be defined in include/uapi/linux/iommufd.h
+ *                It is suggested to call iommufd_viommu_alloc() helper for
+ *                a bundled allocation of the core and the driver structures,
+ *                using the given @ictx pointer.
+ * @default_viommu_ops: Driver can choose to use a default core-allocated core-
+ *                      managed viommu object by providing a default viommu ops.
+ *                      Otherwise, i.e. for a driver-managed viommu, viommu_ops
+ *                      should be passed in via iommufd_viommu_alloc() helper in
+ *                      its own viommu_alloc op.
[..]
+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
...
+	if (cmd->type == IOMMU_VIOMMU_TYPE_DEFAULT) {
+		viommu = __iommufd_viommu_alloc(
+			ucmd->ictx, sizeof(*viommu),
+			domain->ops->default_viommu_ops);
+	} else {
+		if (!domain->ops->viommu_alloc) {
+			rc = -EOPNOTSUPP;
+			goto out_put_hwpt;
+		}
+
+		viommu = domain->ops->viommu_alloc(domain, idev->dev,
+						   ucmd->ictx, cmd->type);
+	}
[..]

// Helper:
+struct device *
+iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id);
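A driver could then use that helper roughly like below. This is only
an untested sketch of the idea: the surrounding driver function, how
it receives the virtual dev ID, and what it does with the physical
device are all made up for illustration.

/*
 * Hypothetical driver-side user of the helper: translate a virtual
 * dev ID coming from the VM into the physical device behind the
 * core-managed viommu, then act on that device. The function name,
 * flow and error code here are illustrative only.
 */
static int my_driver_handle_vdev_req(struct iommufd_viommu *viommu,
				     u64 virt_dev_id)
{
	struct device *dev;

	dev = iommufd_viommu_find_device(viommu, virt_dev_id);
	if (!dev)
		return -ENODEV;

	/* e.g. forward a device-scoped invalidation to the physical RID */
	return 0;
}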
> > The driver will then know it should program three different VMIDs for
> > the same S2 page table, which matches the ARM expectation for
> > VMID. That is to say we'd pass in the viommu as the pt_id for the
> > iommu_hwpt_alloc. The viommu would imply both the S2 page table and
> > any meta information like VMID the driver needs.
>
> Can you elaborate the aspect about "three different VMIDs"? They are
> all for the same VM hence sharing the same VMID per the earlier
> description.
>
> This is also echo-ed in patch14:
>
>   tegra241_cmdqv_viommu_alloc()
>     vintf->vmid = smmu_domain->vmid;

The design in this series is still the old one, using a 1:1
relationship between a viommu and an S2 domain. I think the "three"
is from his SMMU example above? Leaving it to Jason to reply though.

> > now. If someone needs them linked someday we can add a viommu_id to
> > the create pri queue command.
>
> I'm more worried about the potential conflict between the vqueue
> object here and the fault queue object in Baolu's series, if we want
> to introduce vIOMMU concept to platforms which lack of the hw
> support.

I actually see one argument about whether we should use a vqueue
vs. Baolu's fault queue object, and a counter argument about whether
we should also use a vqueue for viommu invalidation vs. an
array-based invalidation request that we similarly have for HWPT...

Thanks
Nicolin