On Mon, Jun 10, 2024 at 09:04:46AM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2024 at 02:19:21PM -0700, Nicolin Chen wrote:
>
> > > IOTLB efficiency will suffer though when splitting 1p -> 2v while
> > > invalidation performance will suffer when joining 2p -> 1v.
> >
> > I think the invalidation efficiency is actually solvable. So,
> > basically viommu_invalidate would receive a whole batch of cmds
> > and dispatch them to different pSMMUs (nested_domains/devices).
> > We already have a vdev_id table for devices, yet we just need a
> > new vasid table for nested_domains. Right?
>
> You can't know the ASID usage of the hypervisor from the VM, unless
> you also inspect the CD table memory in the guest. That seems like
> something we should try hard to avoid.

Actually, even now that we put a dispatcher in the VMM, the VMM still
has to decode the CD table to link an ASID to its s1_hwpt. Otherwise,
it could only broadcast a TLBI cmd to all pSMMUs.

Doing it the other way, by moving this to the kernel, we'd just need
a pair of new ioctls, used when the VMM traps CFGI_CD cmds, so that
the kernel driver instead of the VMM user driver manages the links
between ASIDs and nested domains. Either a master ASID or SVA ASIDs
can be linked to the same nested_domain that's allocated per vSTE.

> > With that being said, it would make the kernel design a bit more
> > complicated. And the VMM still has to separate the commands for
> > passthrough devices (HW iotlb) from commands for emulated devices
> > (emulated iotlb), unless we further split the topology at the VM
> > level to have a dedicated vSMMU for all passthrough devices --
> > then VMM could just forward its entire cmdq to the kernel without
> > deciphering every command (likely?).
>
> I would not include the emulated devices in a shared SMMU.. For the
> same reason, we should try hard to avoid inspecting the page table
> memory.

I wouldn't like the idea of attaching emulated devices to a shared
vSMMU either. Yet, mind elaborating why this would inspect the page
table memory? Or do you mean we should avoid the VMM inspecting user
tables?

> If a viommu is needed for emulated then virtio-iommu may be more
> appropriate..
>
> That said I'm sure someone will want to do this, so as long as it is
> possible in the VMM, as slow as it may be, then it is fine.

Eric hasn't replied to my previous query regarding how to design
this, yet I guess the same. And it looks like Intel is doing so for
emulated devices, since there is only one intel_iommu instance in a
VM.

Thanks
Nicolin
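
P.S. To make the vasid table idea above a bit more concrete, here is a rough userspace sketch, not a proposed kernel interface. Every name in it (struct viommu, viommu_link_asid, viommu_dispatch, the table sizes) is hypothetical and made up for illustration. It assumes an ASID gets linked to one or more pSMMUs when a CFGI_CD is trapped, and that a TLBI with an unlinked ASID falls back to a broadcast to all pSMMUs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VASID 16	/* tiny table, for illustration only */
#define MAX_PSMMU 4

/* Hypothetical per-viommu state: one pSMMU bitmask per guest ASID */
struct viommu {
	uint32_t vasid[MAX_VASID];	/* bit p set: ASID is in use behind pSMMU p */
	unsigned int nr_psmmu;
};

/* Hypothetical, simplified invalidation command carrying only an ASID */
struct tlbi_cmd {
	uint16_t asid;
};

/* Would be called when a CFGI_CD is trapped: link an ASID to a pSMMU */
static int viommu_link_asid(struct viommu *v, uint16_t asid,
			    unsigned int psmmu)
{
	if (asid >= MAX_VASID || psmmu >= v->nr_psmmu)
		return -1;
	v->vasid[asid] |= 1u << psmmu;
	return 0;
}

/*
 * Walk a batch of commands and count, in sent[], how many of them each
 * pSMMU would receive. An ASID without any link is broadcast.
 */
static void viommu_dispatch(const struct viommu *v,
			    const struct tlbi_cmd *cmds, size_t n,
			    unsigned int *sent)
{
	assert(v->nr_psmmu <= MAX_PSMMU);

	for (unsigned int p = 0; p < v->nr_psmmu; p++)
		sent[p] = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t mask = 0;

		if (cmds[i].asid < MAX_VASID)
			mask = v->vasid[cmds[i].asid];
		if (!mask)	/* unknown ASID: fall back to broadcast */
			mask = (1u << v->nr_psmmu) - 1;

		for (unsigned int p = 0; p < v->nr_psmmu; p++)
			if (mask & (1u << p))
				sent[p]++;
	}
}
```

With two pSMMUs, ASID 1 linked to pSMMU 0 and ASID 2 linked to pSMMU 1, a batch of TLBIs for ASIDs 1, 2 and 3 would put two commands on each pSMMU: the linked one plus the broadcast for the unlinked ASID 3.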