On Fri, Sep 27, 2024 at 08:59:25AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 11:02:37PM -0700, Nicolin Chen wrote:
> > On Fri, Sep 27, 2024 at 01:38:08PM +0800, Yi Liu wrote:
> > > > > Does it mean each vIOMMU of VM can only have
> > > > > one s2 HWPT?
> > > >
> > > > Giving some examples here:
> > > > - If a VM has 1 vIOMMU, there will be 1 vIOMMU object in the
> > > >   kernel holding one S2 HWPT.
> > > > - If a VM has 2 vIOMMUs, there will be 2 vIOMMU objects in the
> > > >   kernel that can hold two different S2 HWPTs, or share one S2
> > > >   HWPT (saving memory).
> > >
> > > So if you have two devices assigned to a VM, then you may have two
> > > vIOMMUs or one vIOMMU exposed to guest. This depends on whether the two
> > > devices are behind the same physical IOMMU. If it's two vIOMMUs, the two
> > > can share the s2 hwpt if their physical IOMMU is compatible. is it?
> >
> > Yes.
> >
> > > To achieve the above, you need to know if the physical IOMMUs of the
> > > assigned devices, hence be able to tell if physical IOMMUs are the
> > > same and if they are compatible. How would userspace know such infos?
> >
> > My draft implementation with QEMU does something like this:
> > - List all viommu-matched iommu nodes under /sys/class/iommu: LINKs
> > - Get PCI device's /sys/bus/pci/devices/0000:00:00.0/iommu: LINK0
> > - Compare the LINK0 against the LINKs
> >
> > We so far don't have an ID for physical IOMMU instance, which can
> > be an alternative to return via the hw_info call, otherwise.
>
> We could return the sys/class/iommu string from some get_info or
> something

I had a patch doing an ida alloc for each iommu_dev and returning
the ID via hw_info. It wasn't useful at that time, as we went for
fail-n-retry for S2 HWPT allocations on multi-pIOMMU platforms.

Perhaps that could be cleaner than returning a string?

> > For compatibility to share a stage-2 HWPT, basically we would do
> > a device attach to one of the stage-2 HWPT from the list that VMM
> > should keep. This attach has all the compatibility test, down to
> > the IOMMU driver. If it fails, just allocate a new stage-2 HWPT.
>
> Ideally just creating the viommu should validate the passed in hwpt is
> compatible without attaching.

I think I should add a validation between hwpt->domain->owner and
dev_iommu_ops(idev->dev) then!

Thanks
Nicolin
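
As an illustration of the owner check mentioned above, a standalone
mock might look like the following. It is not the actual patch: the
structs are heavily reduced userspace stand-ins for the kernel ones
(only the pointers the check walks are kept, plus a placeholder
member), and the function name viommu_check_s2_hwpt() is invented for
this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Reduced stand-ins; the real kernel structures carry many more fields. */
struct iommu_ops { int placeholder; /* not a real kernel field */ };
struct iommu_domain { const struct iommu_ops *owner; };
struct iommu_device { const struct iommu_ops *ops; };
struct dev_iommu { struct iommu_device *iommu_dev; };
struct device { struct dev_iommu *iommu; };
struct iommufd_device { struct device *dev; };
struct iommufd_hw_pagetable { struct iommu_domain *domain; };

/* Same pointer walk as the kernel helper of this name. */
static const struct iommu_ops *dev_iommu_ops(struct device *dev)
{
	return dev->iommu->iommu_dev->ops;
}

/*
 * Hypothetical check: reject a stage-2 HWPT whose domain was allocated
 * by a different IOMMU driver than the one behind the device.
 * Returns 0 when the ops match, -EINVAL otherwise.
 */
int viommu_check_s2_hwpt(struct iommufd_device *idev,
			 struct iommufd_hw_pagetable *hwpt)
{
	assert(idev && hwpt);

	if (hwpt->domain->owner != dev_iommu_ops(idev->dev))
		return -EINVAL;
	return 0;
}
```

Note that comparing ops pointers is a driver-level test only. Two
physical IOMMU instances handled by the same driver would pass it, so
any per-instance compatibility would still have to be checked by the
driver itself.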