On Fri, Jun 07, 2024 at 02:19:21PM -0700, Nicolin Chen wrote:

> > IOTLB efficiency will suffer though when splitting 1p -> 2v while
> > invalidation performance will suffer when joining 2p -> 1v.
>
> I think the invalidation efficiency is actually solvable. So,
> basically viommu_invalidate would receive a whole batch of cmds
> and dispatch them to different pSMMUs (nested_domains/devices).
> We already have a vdev_id table for devices, yet we just need a
> new vasid table for nested_domains. Right?

You can't know the ASID usage of the hypervisor from the VM, unless
you also inspect the CD table memory in the guest. That seems like
something we should try hard to avoid.

> With that being said, it would make the kernel design a bit more
> complicated. And the VMM still has to separate the commands for
> passthrough devices (HW iotlb) from commands for emulated devices
> (emulated iotlb), unless we further split the topology at the VM
> level to have a dedicated vSMMU for all passthrough devices --
> then VMM could just forward its entire cmdq to the kernel without
> deciphering every command (likely?).

I would not include the emulated devices in a shared SMMU.. For the
same reason, we should try hard to avoid inspecting the page table
memory.

If a viommu is needed for emulated then virtio-iommu may be more
appropriate..

That said I'm sure someone will want to do this, so as long as it is
possible in the VMM, as slow as it may be, then it is fine.

Jason