Re: [RFC PATCH 0/11] Rework gfn_to_pfn_cache

On Mon, 2021-11-15 at 19:50 +0100, Paolo Bonzini wrote:
> On 11/15/21 17:47, David Woodhouse wrote:
> > I need to move it *back*  to
> > invalidate_range_start() where I had it before, if I want to let it
> > wait for vCPUs to exit. Which means... that the cache 'refresh' call
> > must wait until the mmu_notifier_count reaches zero? Am I allowed to do
> > that, and make the "There can be only one waiter" comment in
> > kvm_mmu_notifier_invalidate_range_end() no longer true?
> 
> You can also update the cache while taking the mmu_lock for read, and 
> retry if mmu_notifier_retry_hva tells you to do so.  Looking at the 
> scenario from commit e649b3f0188 you would have:
> 
>        (Invalidator) kvm_mmu_notifier_invalidate_range_start()
>        (Invalidator) write_lock(mmu_lock)
>        (Invalidator) increment mmu_notifier_count
>        (Invalidator) write_unlock(mmu_lock)
>        (Invalidator) request KVM_REQ_APIC_PAGE_RELOAD
>        (KVM VCPU) vcpu_enter_guest()
>        (KVM VCPU) kvm_vcpu_reload_apic_access_page()
>     +  (KVM VCPU) read_lock(mmu_lock)
>     +  (KVM VCPU) mmu_notifier_retry_hva()
>     +  (KVM VCPU) read_unlock(mmu_lock)
>     +  (KVM VCPU) retry! (mmu_notifier_count > 0)
>        (Invalidator) actually unmap page
>     +  (Invalidator) kvm_mmu_notifier_invalidate_range_end()
>     +  (Invalidator) write_lock(mmu_lock)
>     +  (Invalidator) decrement mmu_notifier_count
>     +  (Invalidator) write_unlock(mmu_lock)
>     +  (KVM VCPU) vcpu_enter_guest()
>     +  (KVM VCPU) kvm_vcpu_reload_apic_access_page()
>     +  (KVM VCPU) mmu_notifier_retry_hva()
> 
> Changing mn_memslots_update_rcuwait to a waitq (and renaming it to 
> mn_invalidate_waitq) is of course also a possibility.
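
For context, the check being referred to boils down to roughly this
(paraphrased from include/linux/kvm_host.h; treat it as a sketch rather
than the exact code):

static inline int mmu_notifier_retry_hva(struct kvm *kvm,
					 unsigned long mmu_seq,
					 unsigned long hva)
{
	lockdep_assert_held(&kvm->mmu_lock);

	/*
	 * Retry if an invalidation covering this hva is in progress, or
	 * if one has completed since mmu_seq was sampled.
	 */
	if (unlikely(kvm->mmu_notifier_count) &&
	    hva >= kvm->mmu_notifier_range_start &&
	    hva < kvm->mmu_notifier_range_end)
		return 1;
	if (kvm->mmu_notifier_seq != mmu_seq)
		return 1;

	return 0;
}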

I do think I'll go for a waitq, but let's start *really* simple to make
sure I've got the basics right... does this look vaguely sensible?

It returns -EAGAIN and lets the caller retry; I started with a 'goto'
but didn't have a sane exit condition. In fact, I *still* don't have a
sane exit condition for callers like evtchn_set_fn().
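
The caller side would then look something like this (purely
illustrative; the arguments after kvm/gpc are made up since I'm not
showing the full prototype here):

/* Illustrative only; the arguments beyond kvm/gpc are invented. */
static int example_map_cache(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
			     gpa_t gpa, unsigned long len)
{
	int ret;

	do {
		ret = kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
		/*
		 * -EAGAIN means an invalidation raced with the refresh.
		 * That's fine for a caller which can sleep and retry, but
		 * the loop has no obvious bound for the callers mentioned
		 * above.
		 */
	} while (ret == -EAGAIN);

	return ret;
}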

I'm actually tempted to split the caches into two lists
(kvm->guest_gpc_list, kvm->atomic_gpc_list) and invalidate only the
*former* from invalidate_range_start(), with these -EAGAIN semantics.
The atomic ones can stay precisely as they were in the series I already
sent, since they never need to spin/wait as long as they're invalidated
from the invalidate_range() MMU notifier.
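
As a very rough sketch of that split (nothing below exists in the
posted series; the list names are just for illustration):

struct kvm {
	...
	spinlock_t gpc_lock;
	/* Invalidated from invalidate_range_start(); may wait / -EAGAIN */
	struct list_head guest_gpc_list;
	/* Invalidated from invalidate_range(); never waits */
	struct list_head atomic_gpc_list;
	...
};

so that invalidate_range_start() walks only guest_gpc_list (and may
block), while invalidate_range() keeps walking atomic_gpc_list exactly
as in the existing series.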

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d1187b051203..2d76c09e460c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -151,6 +151,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_UNBLOCK           2
 #define KVM_REQ_UNHALT            3
 #define KVM_REQ_VM_DEAD           (4 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_GPC_INVALIDATE    (5 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQUEST_ARCH_BASE     8
 
 #define KVM_ARCH_REQ_FLAGS(nr, flags) ({ \
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7382aa45d5e8..9bc3162ba650 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -468,8 +468,6 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn,
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	int idx;
 
-	gfn_to_pfn_cache_invalidate(kvm, start, end, false);
-
 	idx = srcu_read_lock(&kvm->srcu);
 	kvm_arch_mmu_notifier_invalidate_range(kvm, start, end);
 	srcu_read_unlock(&kvm->srcu, idx);
@@ -689,6 +687,8 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	kvm->mn_active_invalidate_count++;
 	spin_unlock(&kvm->mn_invalidate_lock);
 
+	gfn_to_pfn_cache_invalidate(kvm, range->start, range->end, hva_range.may_block);
+
 	__kvm_handle_hva_range(kvm, &hva_range);
 
 	return 0;
@@ -2674,7 +2674,6 @@ static void gfn_to_pfn_cache_invalidate(struct kvm *kvm, unsigned long start,
 	}
 	spin_unlock(&kvm->gpc_lock);
 
-#if 0
 	unsigned int req = KVM_REQ_GPC_INVALIDATE;
 
 	/*
@@ -2690,7 +2689,6 @@ static void gfn_to_pfn_cache_invalidate(struct kvm *kvm, unsigned long start,
 	} else if (wake_vcpus) {
 		called = kvm_make_vcpus_request_mask(kvm, req, vcpu_bitmap);
 	}
-#endif
 	WARN_ON_ONCE(called && !may_block);
 }
 
@@ -2767,6 +2765,8 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	if (!old_valid || old_uhva != gpc->uhva) {
 		unsigned long uhva = gpc->uhva;
 		void *new_khva = NULL;
+		unsigned long mmu_seq;
+		int retry;
 
 		/* Placeholders for "hva is valid but not yet mapped" */
 		gpc->pfn = KVM_PFN_ERR_FAULT;
@@ -2775,10 +2775,20 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 		write_unlock_irq(&gpc->lock);
 
+		mmu_seq = kvm->mmu_notifier_seq;
+		smp_rmb();
+
 		new_pfn = hva_to_pfn(uhva, false, NULL, true, NULL);
 		if (is_error_noslot_pfn(new_pfn))
 			ret = -EFAULT;
-		else if (gpc->kernel_map) {
+		else {
+			read_lock(&kvm->mmu_lock);
+			retry = mmu_notifier_retry_hva(kvm, mmu_seq, uhva);
+			read_unlock(&kvm->mmu_lock);
+			if (retry)
+				ret = -EAGAIN; // or goto the mmu_seq setting bit to retry?
+		}
+		if (!ret && gpc->kernel_map) {
 			if (new_pfn == old_pfn) {
 				new_khva = (void *)((unsigned long)old_khva - page_offset);
 				old_pfn = KVM_PFN_ERR_FAULT;
