On 20/04/21 08:08, Wanpeng Li wrote:
> On Tue, 20 Apr 2021 at 14:02, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>> On Tue, 20 Apr 2021 at 00:59, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>> On 19/04/21 18:32, Sean Christopherson wrote:
>>>> If false positives are a big concern, what about adding another pass to the loop
>>>> and only yielding to usermode vCPUs with interrupts in the second full pass?
>>>> I.e. give vCPUs that are already in kernel mode priority, and only yield to
>>>> handle an interrupt if there are no vCPUs in kernel mode.
>>>> kvm_arch_dy_runnable() pulls in pv_unhalted, which seems like a good thing.
>>> pv_unhalted won't help if you're waiting for a kernel spinlock though,
>>> would it?  Doing two passes (or looking for a "best" candidate that
>>> prefers kernel mode vCPUs to user mode vCPUs waiting for an interrupt)
>>> seems like the best choice overall.
>> How about something like this:
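
For reference, pv_unhalted enters the picture through vcpu_dy_runnable(),
which the loop below relies on.  From memory (a sketch, not pasted from
the tree), the generic helper is roughly:

static bool vcpu_dy_runnable(struct kvm_vcpu *vcpu)
{
	/*
	 * On x86, kvm_arch_dy_runnable() returns true for a vCPU with
	 * pv_unhalted set or with an interrupt to deliver.
	 */
	if (kvm_arch_dy_runnable(vcpu))
		return true;

#ifdef CONFIG_KVM_ASYNC_PF
	/* A completed async page fault also makes the vCPU a good target. */
	if (!list_empty_careful(&vcpu->async_pf.done))
		return true;
#endif

	return false;
}
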
I was thinking of something simpler:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9b8e30dd5b9b..455c648f9adc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3198,10 +3198,9 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 {
 	struct kvm *kvm = me->kvm;
 	struct kvm_vcpu *vcpu;
-	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
 	int yielded = 0;
 	int try = 3;
-	int pass;
+	int pass, num_passes = 1;
 	int i;
 
 	kvm_vcpu_set_in_spin_loop(me, true);
@@ -3212,13 +3211,14 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 	 * VCPU is holding the lock that we need and will release it.
 	 * We approximate round-robin by starting at the last boosted VCPU.
 	 */
-	for (pass = 0; pass < 2 && !yielded && try; pass++) {
-		kvm_for_each_vcpu(i, vcpu, kvm) {
-			if (!pass && i <= last_boosted_vcpu) {
-				i = last_boosted_vcpu;
-				continue;
-			} else if (pass && i > last_boosted_vcpu)
-				break;
+	for (pass = 0; pass < num_passes; pass++) {
+		int idx = me->kvm->last_boosted_vcpu;
+		int n = atomic_read(&kvm->online_vcpus);
+		for (i = 0; i < n; i++, idx++) {
+			if (idx == n)
+				idx = 0;
+
+			vcpu = kvm_get_vcpu(kvm, idx);
 			if (!READ_ONCE(vcpu->ready))
 				continue;
 			if (vcpu == me)
@@ -3226,23 +3226,36 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 			if (rcuwait_active(&vcpu->wait) &&
 			    !vcpu_dy_runnable(vcpu))
 				continue;
-			if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
-			    !kvm_arch_vcpu_in_kernel(vcpu))
-				continue;
 			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
 				continue;
+			if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
+			    !kvm_arch_vcpu_in_kernel(vcpu)) {
+				/*
+				 * A vCPU running in userspace can get to kernel mode via
+				 * an interrupt.  That's a worse choice than a vCPU already
+				 * in kernel mode, so only consider it on a second pass.
+				 */
+				if (!vcpu_dy_runnable(vcpu))
+					continue;
+				if (pass == 0) {
+					num_passes = 2;
+					continue;
+				}
+			}
+
 			yielded = kvm_vcpu_yield_to(vcpu);
 			if (yielded > 0) {
-				kvm->last_boosted_vcpu = i;
-				break;
+				kvm->last_boosted_vcpu = idx;
+				goto done;
 			} else if (yielded < 0) {
 				try--;
 				if (!try)
-					break;
+					goto done;
 			}
 		}
 	}
+done:
 	kvm_vcpu_set_in_spin_loop(me, false);
 
 	/* Ensure vcpu is not eligible during next spinloop */
 	kvm_vcpu_set_dy_eligible(me, false);
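
Untested, of course.  As a sanity check on the selection order, here is a
quick userspace toy model of the same two-pass logic; the struct, fields
and helper names are all invented for the example, nothing is copied from
KVM:

#include <stdbool.h>
#include <stdio.h>

struct toy_vcpu {
	bool ready;		/* preempted while still runnable */
	bool in_kernel;		/* was running kernel code */
	bool has_interrupt;	/* in userspace, interrupt pending */
};

/* Return the vCPU index to yield to, or -1; round-robin from last_boosted. */
static int pick_target(const struct toy_vcpu *v, int n, int last_boosted)
{
	int pass, num_passes = 1;

	for (pass = 0; pass < num_passes; pass++) {
		int i, idx = last_boosted;

		for (i = 0; i < n; i++, idx++) {
			if (idx == n)
				idx = 0;
			if (!v[idx].ready)
				continue;
			if (!v[idx].in_kernel) {
				/*
				 * A userspace vCPU is only interesting if it
				 * has an interrupt to deliver, and even then
				 * only after kernel-mode candidates have had
				 * a full pass.
				 */
				if (!v[idx].has_interrupt)
					continue;
				if (pass == 0) {
					num_passes = 2;
					continue;
				}
			}
			return idx;
		}
	}
	return -1;
}

int main(void)
{
	struct toy_vcpu v[] = {
		{ .ready = true,  .in_kernel = false, .has_interrupt = true  },
		{ .ready = true,  .in_kernel = true,  .has_interrupt = false },
		{ .ready = false, .in_kernel = true,  .has_interrupt = false },
	};

	/* Picks vCPU 1 (kernel mode) over vCPU 0 (userspace + interrupt). */
	printf("yield to vCPU %d\n", pick_target(v, 3, 0));
	return 0;
}

Clear vCPU 1's in_kernel flag and the second pass kicks in, picking vCPU 0
instead.
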
Paolo