RE: iommufd dirty page logging overview

> From: Tian, Kevin
> Sent: Saturday, March 19, 2022 4:15 PM
> 
> 
> Let me make it more specific by taking vIOMMU as an example.
> No nesting i.e. Qemu generates a merged mapping for GIOVA->HPA
> via iommufd.
> 
> iommufd unmap is caused when emulating virtual iotlb invalidation
> request, *after* the guest iommu driver clears the guest I/O page
> table for the specified GIOVA range.
> 
> The dirty bits recorded by the device is around the dma addresses
> programmed by the guest, i.e. GIOVA.
> 
> Now if qemu pulls dirty bits from vfio device after iommufd unmap,
> how would qemu even know the corresponding PFN/VA for dirty
> GFNs given the guest I/O mapping has been cleared?
> 

Thinking more, the real problem is not related to the *before* vs.
*after* thing. :/ Unless Qemu itself maintains a virtual iotlb (large
enough to duplicate all valid mappings in the guest I/O page table) and
ensures that the cached mappings for the unmapped range are not zapped
before the dirty bitmap for that range is digested, the whole dirty
tracking is just broken in this scenario. That holds no matter which
approach is used and whether the bitmap is retrieved before or after the
iommufd unmap: the guest mappings for dirtied GIOVAs in the unmapped
range have already disappeared at that point, so the path to resolve
GIOVA->GPA->HVA is gone.
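
To make the ordering requirement concrete, here is a toy model (not Qemu
code). ShadowViotlb, handle_invalidation and the dirty_giovas list are all
hypothetical names for illustration; the list stands in for whatever dirty
bitmap is reported for the range. The only point is that the shadow
mappings must be consulted before they are zapped.

```python
PAGE = 4096  # assumed page size for this sketch


class ShadowViotlb:
    """Toy shadow of the guest I/O page table: GIOVA page -> GPA page."""

    def __init__(self):
        self.map = {}

    def guest_map(self, giova, gpa):
        # Record one page-sized guest mapping.
        self.map[giova // PAGE] = gpa // PAGE

    def translate(self, giova):
        # Return the GPA for a GIOVA, or None if no mapping is cached.
        gpa_pfn = self.map.get(giova // PAGE)
        if gpa_pfn is None:
            return None
        return gpa_pfn * PAGE + giova % PAGE

    def zap(self, giova, size):
        # Drop every cached page mapping that overlaps [giova, giova + size).
        first = giova // PAGE
        last = (giova + size + PAGE - 1) // PAGE
        for pfn in range(first, last):
            self.map.pop(pfn, None)


def handle_invalidation(viotlb, giova, size, dirty_giovas):
    """Emulate a guest iotlb invalidation for [giova, giova + size).

    Dirty GIOVAs are translated to GPA pages first; only then are the
    shadow mappings zapped. Returns the set of dirty GPA page addresses.
    """
    dirty_gpas = set()
    for d in dirty_giovas:
        if giova <= d < giova + size:
            gpa = viotlb.translate(d)
            if gpa is not None:
                dirty_gpas.add(gpa - gpa % PAGE)
    viotlb.zap(giova, size)
    return dirty_gpas
```

If zap() ran first, translate() would return None for every dirty GIOVA in
the range and the dirty information would be lost, which is the breakage
described above.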

I roughly recall that a gap in Qemu's viotlb was discussed when the dirty
bitmap was added to vfio unmap. At that time Qemu's viotlb was like a
normal iotlb, i.e. it only cached mappings obtained by walking the guest
page table when emulating DMA from non-vfio devices in Qemu. That is
definitely inadequate for the aforementioned purpose.

But I don't know whether this gap has been fixed now.

There is no such concern with dpdk or a VM w/o vIOMMU, since the
iova address space is managed by host userspace, which has intrinsic
knowledge of IOVA<->HVA even after iommufd unmap.

This is also fine with hardware nesting. The hardware ensures that all
stage-1 activities converge on the dirty bits of stage-2 IOPTEs, so
userspace can ignore stage-1 and just collect the dirty bitmap
associated with stage-2.
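
As a rough sketch of what "just collect the stage-2 bitmap" means for
userspace: since stage-2 translates GPA, each set bit can be turned
directly into a GPA page with no stage-1 lookup. The helper name and the
bitmap layout here (one bit per page, least significant bit first within
each byte, starting at a base address) are assumptions for illustration,
not the iommufd uAPI.

```python
def dirty_pages(bitmap: bytes, base: int, page_size: int = 4096):
    """Expand a per-page dirty bitmap into a list of dirty page addresses.

    Assumed layout: bit i (LSB-first within each byte) covers the page at
    base + i * page_size.
    """
    pages = []
    for byte_idx, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                pages.append(base + (byte_idx * 8 + bit) * page_size)
    return pages
```

The resulting addresses are stage-2 input addresses, so they can be fed
into the usual GPA->HVA lookup without consulting any guest I/O page table.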

Thanks
Kevin
