On Thu, Aug 09, 2018 at 12:23:43PM +0300, Michael S. Tsirkin wrote:
> On Wed, Aug 08, 2018 at 11:45:43AM +0800, Peter Xu wrote:
> > On Wed, Aug 08, 2018 at 12:58:32AM +0300, Michael S. Tsirkin wrote:
> > > At least with VTD, it seems entirely possible to change e.g. a PMD
> > > atomically to point to a different set of PTEs, then flush.
> > > That will allow removing memory at high granularity for
> > > an arbitrary device without mdev or PASID dependency.
> >
> > My understanding is that the guest driver should prohibit this kind of
> > operation (say, modifying PMD).
>
> Interesting. Which part of the VTD spec prohibits this?
>
> > Actually I don't see how it can
> > happen in Linux if the kernel drivers always call the IOMMU API since
> > there are only map/unmap APIs rather than this atomic-modify API.
>
> It could happen with a non-Linux guest which might have a different API.
>
> > The thing is that IMHO it's the guest driver's responsibility to make
> > sure the pages will never be used by the device before it removes the
> > entry (including modifying the PMD since that actually removes all the
> > entries on the old PMD).
>
> If you switch PMDs atomically from one set of valid PTEs to another,
> then flush, then as far as I could see it just works in the hardware
> VTD, but not in the emulated VTD. So that's a difference in
> behaviour. Maybe we are lucky and no one does that.

Yes, but AFAICT that's also the best we can have now since the
userspace QEMU (or say, the VT-d emulation code) cannot really modify
a real PMD that the hardware uses - it can only call the VFIO APIs,
and finally it boils down again to the host kernel IOMMU APIs to do
map or unmap only. So it's an impossible task until we provide such an
interface through the whole IOMMU/VFIO/... stack just like what you
have discussed in the other thread.

Thanks,

--
Peter Xu
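
To illustrate the difference in behaviour discussed above, here is a toy
model in Python. It is only a sketch under stated assumptions, not QEMU or
VFIO code: HostIommu and replay_pmd_switch are hypothetical names, and the
model simply assumes the host side offers map and unmap with no "replace"
primitive, and that mapping an already-mapped IOVA is rejected. Under those
assumptions, replaying a guest's atomic PMD switch has to unmap the old
entries before mapping the new ones, so there is a window in which the
device would fault.

```python
class HostIommu:
    """Hypothetical stand-in for the host mappings; not a real VFIO binding."""

    def __init__(self):
        self.table = {}  # IOVA page -> host physical page
        self.log = []    # order of operations, for inspection

    def map(self, iova, hpa):
        # Assumption of this model: no in-place replace of a live mapping.
        if iova in self.table:
            raise ValueError("iova already mapped; unmap first")
        self.table[iova] = hpa
        self.log.append(("map", iova))

    def unmap(self, iova):
        del self.table[iova]
        self.log.append(("unmap", iova))

    def translate(self, iova):
        # None models a DMA fault (no translation present).
        return self.table.get(iova)


def replay_pmd_switch(host, old_ptes, new_ptes):
    """Replay an atomic PMD switch using map/unmap only.

    Returns what a device would see for each new IOVA in the window
    between the unmap phase and the map phase.
    """
    for iova in old_ptes:
        host.unmap(iova)
    gap = [host.translate(iova) for iova in new_ptes]
    for iova, hpa in new_ptes.items():
        host.map(iova, hpa)
    return gap


host = HostIommu()
old = {0x1000: 0xA000, 0x2000: 0xB000}  # PTEs under the old PMD
new = {0x1000: 0xC000, 0x2000: 0xD000}  # PTEs under the new PMD
for iova, hpa in old.items():
    host.map(iova, hpa)

gap = replay_pmd_switch(host, old, new)
print(gap)                         # translations seen mid-replay
print(hex(host.translate(0x1000))) # translation after the replay
```

On real hardware the guest's switch is a single PMD write followed by a
flush, so the device sees either the old or the new translation. In this
model every entry in gap is None, which is the window the emulation cannot
close with map/unmap alone.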