On Wed, Nov 22, 2023 at 04:58:24AM +0000, Tian, Kevin wrote:
> As Yi/Baolu discussed there is an issue in intel-iommu driver which
> incorrectly skips devtlb invalidation in the guest with the assumption
> that the host combines iotlb/devtlb invalidation together. This is
> incorrect and should be fixed.

Yes, this seems quite problematic - you guys will have to think of
something and decide what kind of backward compat you want :(

Maybe the viommu driver can observe the guest and, if it sees an ATC
invalidation, assume it is non-buggy; until one is seen it can do a
combined flush.

> But what I was talking about earlier is about the uAPI between
> viommu and iommu driver. I don't see a need of having separate
> invalidation cmds for each, as I'm not sure what the user can
> expect in the window when iotlb and devtlb are out of sync.

If the guest is always issuing the device invalidation then I don't
see too much point in suppressing it in the kernel. Just forward it
naturally.

> then we just define hwpt 'cache' invalidation in vtd always refers to
> both iotlb and devtlb. Then viommu just needs to call invalidation
> uapi once when emulating virtual iotlb invalidation descriptor
> while emulating the following devtlb invalidation descriptor
> as a nop.

In principle ATC and IOMMU TLB invalidations should not always be
linked. Any scenario that allows devices to share an IOTLB cache tag
requires fewer IOMMU TLB invalidations than ATC invalidations.

I like the view of this invalidation interface as reflecting the
actual HW and not trying to be smarter than real HW.

I'm fully expecting that Intel will adopt a direct-DMA flush queue
like SMMU and AMD have already done as a performance optimization. In
this world it makes no sense that the behavior of the direct-DMA
queue and the driver-mediated queue would be different.

Jason
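
A minimal sketch of the compat heuristic suggested above, as it might
look in the VMM-side vIOMMU emulation. All names here (viommu_state,
vtd_inv_desc, the issue_*_inval() helpers) are hypothetical
illustrations, not real intel-iommu, iommufd, or QEMU interfaces:

/* Placeholder for a guest invalidation descriptor. */
struct vtd_inv_desc;

/* Stand-ins for whatever path actually submits the invalidation. */
void issue_iotlb_only_inval(const struct vtd_inv_desc *desc);
void issue_combined_inval(const struct vtd_inv_desc *desc);
void issue_devtlb_only_inval(const struct vtd_inv_desc *desc);

struct viommu_state {
	bool guest_does_atc_inval;	/* set after first observed ATC inval */
};

/*
 * Start out assuming a buggy guest and issue combined iotlb+devtlb
 * flushes; once the guest is seen emitting its own ATC (devtlb)
 * invalidations, forward each invalidation as-is.
 */
static void viommu_emulate_iotlb_inval(struct viommu_state *vs,
				       const struct vtd_inv_desc *desc)
{
	if (vs->guest_does_atc_inval)
		issue_iotlb_only_inval(desc);	/* non-buggy guest: 1:1 forward */
	else
		issue_combined_inval(desc);	/* compat: cover missing devtlb flush */
}

static void viommu_emulate_devtlb_inval(struct viommu_state *vs,
					const struct vtd_inv_desc *desc)
{
	vs->guest_does_atc_inval = true;	/* guest is non-buggy from now on */
	issue_devtlb_only_inval(desc);
}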