For memslot delete and move, kvm_invalidate_memslot() is required
before the real changes are committed. Besides swapping to an inactive
slot, kvm_invalidate_memslot() also calls
kvm_arch_flush_shadow_memslot(), which on x86 in turn calls
kvm_page_track_flush_slot(). According to the definition of
kvm_page_track_notifier_node, users may drop write-protection for the
pages in the memory slot upon receiving .track_flush_slot.

However, if kvm_prepare_memory_region() fails afterwards, the later
kvm_activate_memslot() only swaps the original slot back, leaving the
previously dropped write protection unrestored. This may not be a
problem for KVM itself as a page-track user, but it can cause problems
for other page-track users, e.g. kvmgt, whose write-protected pages
are removed from its write-protected list and never added back.

So, call kvm_prepare_memory_region() first to prepare metadata before
the slot invalidation, avoiding the need for failure and recovery.

Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
---
 virt/kvm/kvm_main.c | 40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 25d7872b29c1..5f29011f432d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1827,45 +1827,35 @@ static int kvm_set_memslot(struct kvm *kvm,
 	 */
 	mutex_lock(&kvm->slots_arch_lock);
 
-	/*
-	 * Invalidate the old slot if it's being deleted or moved.  This is
-	 * done prior to actually deleting/moving the memslot to allow vCPUs to
-	 * continue running by ensuring there are no mappings or shadow pages
-	 * for the memslot when it is deleted/moved.  Without pre-invalidation
-	 * (and without a lock), a window would exist between effecting the
-	 * delete/move and committing the changes in arch code where KVM or a
-	 * guest could access a non-existent memslot.
-	 *
-	 * Modifications are done on a temporary, unreachable slot.  The old
-	 * slot needs to be preserved in case a later step fails and the
-	 * invalidation needs to be reverted.
-	 */
 	if (change == KVM_MR_DELETE || change == KVM_MR_MOVE) {
 		invalid_slot = kzalloc(sizeof(*invalid_slot), GFP_KERNEL_ACCOUNT);
 		if (!invalid_slot) {
 			mutex_unlock(&kvm->slots_arch_lock);
 			return -ENOMEM;
 		}
-		kvm_invalidate_memslot(kvm, old, invalid_slot);
 	}
 
 	r = kvm_prepare_memory_region(kvm, old, new, change);
 	if (r) {
-		/*
-		 * For DELETE/MOVE, revert the above INVALID change.  No
-		 * modifications required since the original slot was preserved
-		 * in the inactive slots.  Changing the active memslots also
-		 * release slots_arch_lock.
-		 */
-		if (change == KVM_MR_DELETE || change == KVM_MR_MOVE) {
-			kvm_activate_memslot(kvm, invalid_slot, old);
+		if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
 			kfree(invalid_slot);
-		} else {
-			mutex_unlock(&kvm->slots_arch_lock);
-		}
+
+		mutex_unlock(&kvm->slots_arch_lock);
 		return r;
 	}
 
+	/*
+	 * Invalidate the old slot if it's being deleted or moved.  This is
+	 * done prior to actually deleting/moving the memslot to allow vCPUs to
+	 * continue running by ensuring there are no mappings or shadow pages
+	 * for the memslot when it is deleted/moved.  Without pre-invalidation
+	 * (and without a lock), a window would exist between effecting the
+	 * delete/move and committing the changes in arch code where KVM or a
+	 * guest could access a non-existent memslot.
+	 */
+	if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
+		kvm_invalidate_memslot(kvm, old, invalid_slot);
+
 	/*
 	 * For DELETE and MOVE, the working slot is now active as the INVALID
 	 * version of the old slot.  MOVE is particularly special as it reuses
-- 
2.17.1
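
For reference, the hook involved is .track_flush_slot in struct
kvm_page_track_notifier_node (arch/x86/include/asm/kvm_page_track.h).
Below is a minimal sketch of how a page-track user's flush handler
drops write protection; it is loosely modeled on kvmgt's handler, and
struct example_tracker plus the gfn_is_tracked()/gfn_untrack() helpers
are hypothetical bookkeeping, not actual kvmgt names:

	#include <linux/kvm_host.h>
	#include <asm/kvm_page_track.h>

	/* Hypothetical page-track user; only the state the sketch needs. */
	struct example_tracker {
		struct kvm_page_track_notifier_node track_node;
		/* ... per-gfn bookkeeping for write-protected pages ... */
	};

	static void example_track_flush_slot(struct kvm *kvm,
					     struct kvm_memory_slot *slot,
					     struct kvm_page_track_notifier_node *node)
	{
		struct example_tracker *t =
			container_of(node, struct example_tracker, track_node);
		gfn_t gfn;
		int i;

		write_lock(&kvm->mmu_lock);
		for (i = 0; i < slot->npages; i++) {
			gfn = slot->base_gfn + i;
			if (gfn_is_tracked(t, gfn)) {
				/* Drop write protection for the page... */
				kvm_slot_page_track_remove_page(kvm, slot, gfn,
								KVM_PAGE_TRACK_WRITE);
				/*
				 * ...and forget it, with no path to re-add it
				 * if the memslot change is later reverted.
				 */
				gfn_untrack(t, gfn);
			}
		}
		write_unlock(&kvm->mmu_lock);
	}

If kvm_prepare_memory_region() failed after such a flush, the old slot
would be reactivated but these gfns would stay untracked; preparing
first avoids ever reaching the flush on the failure path.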