[PATCH 1/1 v2] KVM: Reduce mmu_lock contention during dirty logging by cond_resched()

From: Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx>

get_dirty_log() needs to hold mmu_lock while write protecting dirty
pages, and this can take a long time when there are many dirty pages to
protect.

Since the guest can fault on those pages during that time, and other mmu
work can be blocked as well, this may cause severe latency and prevent
the system from scaling.

This patch mitigates the problem by checking for mmu_lock contention
every 8K dirty pages protected: write protecting that many pages took
about one hundred microseconds on a typical Xeon server.
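
With the change applied, the core loop of kvm_vm_ioctl_get_dirty_log()
ends up looking roughly like this (simplified excerpt for illustration
only; the actual change is in the diff below):

	/* sketch: error handling and memslot setup omitted */
	spin_lock(&kvm->mmu_lock);
	for (i = 0; i < n / sizeof(long); i++) {
		if (!dirty_bitmap[i])
			continue;

		mask = xchg(&dirty_bitmap[i], 0);
		dirty_bitmap_buffer[i] = mask;

		offset = i * BITS_PER_LONG;
		nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
								offset, mask);
		if (nr_protected > 8192 &&
		    (need_resched() || spin_is_contended(&kvm->mmu_lock))) {
			/* flush TLBs for what was protected, then yield */
			kvm_flush_remote_tlbs(kvm);
			spin_unlock(&kvm->mmu_lock);
			cond_resched();
			spin_lock(&kvm->mmu_lock);
			nr_protected = 0;
		}
	}
	if (nr_protected)
		kvm_flush_remote_tlbs(kvm);
	spin_unlock(&kvm->mmu_lock);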

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx>
---
 v1->v2:
   Replaced spin_needbreak() with spin_is_contended() so that the
     contention check works even when CONFIG_PREEMPT=n.
   Expanded the reasoning behind the check as a comment.
   Raised the value from 2048 to 8192 based on a new test result.
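
 Note on the spin_needbreak() -> spin_is_contended() change: on
 !CONFIG_PREEMPT kernels spin_needbreak() is a constant 0, so the v1
 check could never fire there.  Its definition is roughly:

	/* paraphrased from include/linux/spinlock.h */
	static inline int spin_needbreak(spinlock_t *lock)
	{
	#ifdef CONFIG_PREEMPT
		return spin_is_contended(lock);
	#else
		return 0;
	#endif
	}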

 arch/x86/include/asm/kvm_host.h |    6 +++---
 arch/x86/kvm/mmu.c              |   12 +++++++++---
 arch/x86/kvm/x86.c              |   22 +++++++++++++++++-----
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 69e39bc..b249a8b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -716,9 +716,9 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
 
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
-void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
-				     struct kvm_memory_slot *slot,
-				     gfn_t gfn_offset, unsigned long mask);
+int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
+				    struct kvm_memory_slot *slot,
+				    gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 07424cf..c4abfc1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1081,20 +1081,26 @@ static int __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, int level
  *
  * Used when we do not need to care about huge page mappings: e.g. during dirty
  * logging we do not have any such mappings.
+ *
+ * Returns the number of pages write protected.
  */
-void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
-				     struct kvm_memory_slot *slot,
-				     gfn_t gfn_offset, unsigned long mask)
+int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
+				    struct kvm_memory_slot *slot,
+				    gfn_t gfn_offset, unsigned long mask)
 {
 	unsigned long *rmapp;
+	int nr_protected = 0;
 
 	while (mask) {
 		rmapp = &slot->rmap[gfn_offset + __ffs(mask)];
 		__rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL);
+		++nr_protected;
 
 		/* clear the first set bit */
 		mask &= mask - 1;
 	}
+
+	return nr_protected;
 }
 
 static int rmap_write_protect(struct kvm *kvm, u64 gfn)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4de705c..aaae01a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3092,7 +3092,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 	unsigned long n, i;
 	unsigned long *dirty_bitmap;
 	unsigned long *dirty_bitmap_buffer;
-	bool is_dirty = false;
+	int nr_protected = 0;
 
 	mutex_lock(&kvm->slots_lock);
 
@@ -3121,15 +3121,27 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 		if (!dirty_bitmap[i])
 			continue;
 
-		is_dirty = true;
-
 		mask = xchg(&dirty_bitmap[i], 0);
 		dirty_bitmap_buffer[i] = mask;
 
 		offset = i * BITS_PER_LONG;
-		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
+		nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
+								offset, mask);
+		if (nr_protected > 8192) {
+			/*
+			 * Write protecting many pages can take a long time, so
+			 * check mmu_lock contention here to avoid ms latency.
+			 */
+			if (need_resched() || spin_is_contended(&kvm->mmu_lock)) {
+				kvm_flush_remote_tlbs(kvm);
+				spin_unlock(&kvm->mmu_lock);
+				cond_resched();
+				spin_lock(&kvm->mmu_lock);
+				nr_protected = 0;
+			}
+		}
 	}
-	if (is_dirty)
+	if (nr_protected)
 		kvm_flush_remote_tlbs(kvm);
 
 	spin_unlock(&kvm->mmu_lock);
-- 
1.7.5.4
