[RFC] KVM MMU: improve large munmap efficiency

Flush the shadow MMU instead of iterating over each host VA when doing
a large invalidate range callback.

The previous code is O(N) in the number of virtual pages being
invalidated, while holding both the MMU spinlock and the mmap_sem.
Large unmaps can cause significant delay, during which the process is
unkillable.  Worse, all page allocation could be delayed if there's
enough memory pressure that mmu_shrink gets called.

Signed-off-by: Eric Northup <digitaleric@xxxxxxxxxx>

---

We have seen delays of over 30 seconds doing a large (128GB) unmap.
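
For a sense of scale, here is a rough userspace sketch (not kernel code)
of how many kvm_unmap_hva() calls the old per-page loop makes for a
given range. The 4 KiB page size and the helper name per_page_calls()
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages (PAGE_SHIFT == 12), as on x86. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: number of iterations of
 *     for (; start < end; start += PAGE_SIZE)
 * for the range [start, end). A trailing partial page still costs
 * one iteration, so the count is rounded up.
 */
static uint64_t per_page_calls(uint64_t start, uint64_t end)
{
	uint64_t len, mask = (1ULL << SKETCH_PAGE_SHIFT) - 1;

	assert(start <= end);
	len = end - start;
	return (len >> SKETCH_PAGE_SHIFT) + ((len & mask) != 0);
}
```

Under these assumptions a 128GB range works out to 2^25 (about 33.5
million) calls, all made with the MMU spinlock held.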

It would be nicer to check whether a full flush costs less than
iterating over each HVA page, but that information isn't currently
available to the arch-independent part of KVM.
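
A minimal sketch of what such a check might look like, written as plain
userspace C. Everything here is hypothetical: prefer_full_flush() is not
an existing KVM function, and shadow_pages_in_use stands in for a count
that the arch code would have to export. It also assumes 4 KiB pages and
that both costs are roughly proportional to the counts being compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages (PAGE_SHIFT == 12), as on x86. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical heuristic: prefer a full shadow MMU flush when the
 * number of shadow pages to tear down is smaller than the number of
 * host pages in [start, end) that the per-page loop would visit.
 * shadow_pages_in_use is an assumed input, not something generic KVM
 * code can read today.
 */
static bool prefer_full_flush(uint64_t start, uint64_t end,
			      uint64_t shadow_pages_in_use)
{
	uint64_t range_pages;

	assert(start <= end);
	range_pages = (end - start) >> SKETCH_PAGE_SHIFT;
	return shadow_pages_in_use < range_pages;
}
```

The fixed threshold in the patch below is a stand-in for this kind of
comparison.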

Better ideas would be most welcome ;-)


Tested by attaching a debugger to a running qemu w/kvm and running
"call munmap(0, 1UL << 46)".

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7287bf5..9fe303a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -61,6 +61,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/kvm.h>

+#define MMU_NOTIFIER_FLUSH_THRESHOLD_BYTES	(1024u*1024u*1024u)
+
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");

@@ -332,8 +334,12 @@ static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	 * count is also read inside the mmu_lock critical section.
 	 */
 	kvm->mmu_notifier_count++;
-	for (; start < end; start += PAGE_SIZE)
-		need_tlb_flush |= kvm_unmap_hva(kvm, start);
+	if (end - start < MMU_NOTIFIER_FLUSH_THRESHOLD_BYTES)
+		for (; start < end; start += PAGE_SIZE)
+			need_tlb_flush |= kvm_unmap_hva(kvm, start);
+	else
+		kvm_arch_flush_shadow(kvm);
+
 	need_tlb_flush |= kvm->tlbs_dirty;
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);
--