On Tue, Aug 08, 2023 at 11:32:37AM -0300, Jason Gunthorpe wrote:
....
> > This really needs to be fixed in the primary MMU and not require any direct
> > involvement from secondary MMUs, e.g. the mmu_notifier invalidation itself needs
> > to be skipped.
>
> This likely has the same issue you just described, we don't know if it
> can be skipped until we iterate over the PTEs and by then it is too
> late to invoke the notifier. Maybe some kind of abort and restart

The problem is that KVM currently performs the zap in its handler of
.invalidate_range_start(), so by the time mm aborts, KVM has already
done the zap in the secondary MMU.

Or, could we move the zap on the KVM side to the handler of
.invalidate_range_end(), only for MMU_NOTIFY_PROTECTION_VMA and
MMU_NOTIFIER_RANGE_NUMA?
Then, on the mm side, we could do the abort and update the range passed
to .invalidate_range_end() so that it contains only the successful
subrange.

Is that acceptable?

> scheme could work?
>
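
To make the question above concrete, here is a rough userspace toy model
of the idea. It is not kernel code: every toy_* name is made up for
illustration, the "secondary MMU" is just a bool array, and the skip[]
array stands in for whatever makes mm leave a PTE unchanged. I also
assume the deferral applies only when the event is the PROTECTION_VMA
one and the NUMA flag is set together.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the notifier event and range (toy model only). */
enum toy_event { TOY_NOTIFY_UNMAP, TOY_NOTIFY_PROTECTION_VMA };

struct toy_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
	enum toy_event event;
	bool numa;		/* models the MMU_NOTIFIER_RANGE_NUMA flag */
};

#define TOY_NPAGES 16
static bool toy_spte_present[TOY_NPAGES];	/* toy secondary MMU */

static void toy_zap(unsigned long start, unsigned long end)
{
	assert(start <= end && end <= TOY_NPAGES);
	for (unsigned long i = start; i < end; i++)
		toy_spte_present[i] = false;
}

/* Assumption: defer only for PROTECTION_VMA combined with the NUMA flag. */
static bool toy_defer_zap(const struct toy_range *r)
{
	return r->event == TOY_NOTIFY_PROTECTION_VMA && r->numa;
}

/* Secondary MMU side: zap at start as today, unless the zap is deferred. */
static void toy_invalidate_range_start(const struct toy_range *r)
{
	if (!toy_defer_zap(r))
		toy_zap(r->start, r->end);
}

/* Secondary MMU side: the deferred zap sees the (possibly narrowed) range. */
static void toy_invalidate_range_end(const struct toy_range *r)
{
	if (toy_defer_zap(r))
		toy_zap(r->start, r->end);
}

/* Primary MMU side: walk the pages, then narrow the range before end(). */
static void toy_change_protection(struct toy_range *r, const bool *skip)
{
	unsigned long orig_end = r->end;
	unsigned long first = orig_end, last = r->start;

	toy_invalidate_range_start(r);

	for (unsigned long i = r->start; i < orig_end; i++) {
		if (skip[i])
			continue;	/* PTE left unchanged */
		if (i < first)
			first = i;
		last = i + 1;
	}

	if (first < last) {
		r->start = first;
		r->end = last;
	} else {
		r->end = r->start;	/* nothing changed: empty range */
	}

	toy_invalidate_range_end(r);
}
```

Note that with a single [start, end) range, skipped pages in the middle
of the successful subrange would still be zapped in this model; only
skipped pages at the head and tail of the range are spared.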