On Thu, Nov 7, 2019 at 3:12 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 07/11/19 06:48, Dan Williams wrote:
> >> How do mmu notifiers get held off by page references and does that
> >> machinery work with ZONE_DEVICE? Why is this not a concern for the
> >> VM_IO and VM_PFNMAP case?
> > Put another way, I see no protection against truncate/invalidate
> > afforded by a page pin. If you need guarantees that the page remains
> > valid in the VMA until KVM can install a mmu notifier that needs to
> > happen under the mmap_sem as far as I can see. Otherwise gup just
> > weakly asserts "this pinned page was valid in this vma at one point in
> > time".
>
> The MMU notifier is installed before gup, so any invalidation will be
> preceded by a call to the MMU notifier. In turn,
> invalidate_range_start/end is called with mmap_sem held so there should
> be no race.
>
> However, as Sean mentioned, early put_page of ZONE_DEVICE pages would be
> racy, because we need to keep the reference between the gup and the last
> time we use the corresponding struct page.

If KVM is establishing the mmu_notifier before gup then there is
nothing left to do with that ZONE_DEVICE page, so I'm struggling to
see what further qualification of kvm_is_reserved_pfn() buys the
implementation.

However, if you're attracted to the explicitness of Sean's approach,
can I at least ask for comments asserting that KVM knows it already
holds a reference on that page so the is_zone_device_page() usage is
safe?

David and I are otherwise trying to reduce is_zone_device_page() to
easy-to-audit "obviously safe" cases and to convert the others with
additional synchronization.
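
To make the request concrete, here is a rough userspace sketch of the kind of
comment I have in mind. It is not Sean's patch and none of it is kernel code:
struct mock_page and every mock_* helper are hypothetical stand-ins I made up
for illustration. The only point is that the ZONE_DEVICE check is consulted
solely for a page the caller already holds a reference on, and that the
comment says so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct mock_page {
	int refcount;      /* models page_count() */
	bool reserved;     /* models PG_reserved */
	bool zone_device;  /* models ZONE_DEVICE membership */
};

static int mock_page_count(const struct mock_page *page)
{
	assert(page != NULL);
	return page->refcount;
}

static bool mock_is_zone_device_page(const struct mock_page *page)
{
	assert(page != NULL);
	return page->zone_device;
}

/*
 * The caller must already hold a reference on the page (e.g. taken by
 * gup and not yet dropped), so the ZONE_DEVICE check cannot race with
 * the page being torn down.  A zero refcount means that rule was
 * broken, so report "not ZONE_DEVICE" instead of trusting the page.
 */
static bool mock_kvm_is_zone_device_pfn(const struct mock_page *page)
{
	if (page == NULL || mock_page_count(page) == 0)
		return false;
	return mock_is_zone_device_page(page);
}

/*
 * In this model a reserved page is not refcounted by the caller, except
 * that a pinned ZONE_DEVICE page is treated like a normal refcounted
 * page.  NULL models a pfn with no struct page and counts as reserved.
 */
static bool mock_kvm_is_reserved_pfn(const struct mock_page *page)
{
	if (page == NULL)
		return true;
	return page->reserved && !mock_kvm_is_zone_device_pfn(page);
}
```

In this model a pinned ZONE_DEVICE page is reported as not reserved, while the
same page with a zero refcount falls back to being treated as reserved, which
is the conservative answer when the "already holds a reference" assumption
does not hold.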