Re: [PATCH RFC 04/19] iommu: Add an unmap API that returns dirtied IOPTEs

On 4/30/22 06:12, Baolu Lu wrote:
> On 2022/4/29 05:09, Joao Martins wrote:
>> Today, the dirty state is lost and the page wouldn't be migrated to
>> the destination, potentially leading the guest into error.
>>
>> Add an unmap API that reads the dirty bit and sets it in the
>> user passed bitmap. This unmap iommu API tackles a potentially
>> racy update to the dirty bit *when* doing DMA on an IOVA that is
>> being unmapped at the same time.
>>
>> The new unmap_read_dirty/unmap_pages_read_dirty does not replace
>> the unmap pages path; it is only used when explicitly called with a
>> dirty bitmap passed in.
>>
>> It could be said that the guest is buggy and rather than a special unmap
>> path tackling the theoretical race ... it would suffice to fetch the
>> dirty bits (with GET_DIRTY_IOVA), and then unmap the IOVA.
> 
> I am not sure whether this API could solve the race.
> 

Yeah, it doesn't fully solve the race, as DMA can still potentially
occur until the IOMMU needs to rewalk the page tables (i.e. after the IOTLB flush).


> size_t iommu_unmap(struct iommu_domain *domain,
>                     unsigned long iova, size_t size)
> {
>          struct iommu_iotlb_gather iotlb_gather;
>          size_t ret;
> 
>          iommu_iotlb_gather_init(&iotlb_gather);
>          ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
>          iommu_iotlb_sync(domain, &iotlb_gather);
> 
>          return ret;
> }
> 
> The PTEs are cleared before iotlb invalidation. What if a DMA write
> happens after PTE clearing and before the iotlb invalidation with the
> PTE happening to be cached?


Yep. Jason and Robin made the same point.

To fully handle this we need to force the PTEs read-only, and check the dirty bit
afterwards. So perhaps if we want to go to the extent of fully stopping DMA -- which none
of the unmap APIs ever guarantee -- we need more of a write-protect API that optionally
fetches the dirty bits. The unmap would then remain as is (prior to this series).
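
To make the ordering concrete, here is a toy user-space model in C. It is
not kernel code: every name in it (toy_pte, toy_iotlb, toy_dma_write, and
so on) is made up for illustration. It also assumes that a write going
through a stale IOTLB entry can only record its dirty bit while the
in-memory PTE is still present; real hardware may behave differently.

```c
/* Toy model only: none of these names are kernel or hardware APIs. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct toy_pte   { bool present, writable, dirty; };
struct toy_iotlb { bool valid, writable; };

struct toy_mapping {
	struct toy_pte pte;	/* in-memory page table entry */
	struct toy_iotlb iotlb;	/* translation cached by the IOMMU */
	bool page_written;	/* ground truth: did DMA modify the page? */
};

/* Start with a clean, writable mapping that is already cached. */
static void toy_init(struct toy_mapping *m)
{
	assert(m);
	memset(m, 0, sizeof(*m));
	m->pte.present = m->pte.writable = true;
	m->iotlb.valid = m->iotlb.writable = true;
}

/* Device write; returns true if the write reached memory. */
static bool toy_dma_write(struct toy_mapping *m)
{
	if (!m->iotlb.valid) {
		/* IOTLB miss: rewalk the page table. */
		if (!m->pte.present)
			return false;	/* translation fault */
		m->iotlb.valid = true;
		m->iotlb.writable = m->pte.writable;
	}
	if (!m->iotlb.writable)
		return false;		/* write-protected: fault */
	m->page_written = true;
	/* Assumption: dirty is only recorded if the PTE still exists. */
	if (m->pte.present)
		m->pte.dirty = true;
	return true;
}

static void toy_iotlb_flush(struct toy_mapping *m)
{
	m->iotlb.valid = false;
}

/* Clear the PTE while reading dirty, then flush (the racy ordering). */
static bool toy_unmap_read_dirty(struct toy_mapping *m, bool racing_write)
{
	bool dirty = m->pte.dirty;

	memset(&m->pte, 0, sizeof(m->pte));
	if (racing_write)
		toy_dma_write(m);	/* goes through the stale IOTLB entry */
	toy_iotlb_flush(m);
	return dirty;
}

/* Write-protect, flush, read dirty, and only then unmap. */
static bool toy_wrprotect_read_dirty_unmap(struct toy_mapping *m,
					   bool racing_write)
{
	bool dirty;

	m->pte.writable = false;
	if (racing_write)
		toy_dma_write(m);	/* still succeeds, but PTE records dirty */
	toy_iotlb_flush(m);		/* later writes now fault on the rewalk */
	dirty = m->pte.dirty;
	memset(&m->pte, 0, sizeof(m->pte));
	toy_iotlb_flush(m);
	return dirty;
}
```

With a racing write, toy_unmap_read_dirty() returns false even though
page_written ends up true, whereas toy_wrprotect_read_dirty_unmap()
returns true. The cost is visible too: the second flow needs an extra
IOTLB flush before the dirty bit can be trusted.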

Now whether this race is worth solving isn't clear (bearing in mind that solving the race
will add a lot of overhead), and git/mailing list archaeology doesn't tell us whether this
was ever useful in practice either :(

	Joao


