On 2022/11/17 1:19, David Matlack wrote:
On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
Hi Sean, Paolo,
I recently also noticed the behavior change of the halt_poll_ns parameter.
It has lost the ability to:
1) dynamically disable halt polling for all running VMs
by `echo 0 > /sys`
2) dynamically adjust the halt-polling interval for all
running VMs by `echo * > /sys`
In our case we regularly use both of these abilities, and
KVM_CAP_HALT_POLL is not used yet.
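For reference, here is a minimal userspace sketch (not from this thread) of how an
admin tool could perform such a global adjustment; it assumes root privileges and
that the kvm module exposes the parameter at /sys/module/kvm/parameters/halt_poll_ns:

#include <stdio.h>

/*
 * Sketch: set the global halt-polling interval for all VMs by writing the
 * kvm module parameter. Writing 0 disables halt polling globally, which is
 * the pre-KVM_CAP_HALT_POLL behavior this thread wants to restore.
 */
static int set_global_halt_poll_ns(unsigned int ns)
{
	FILE *f = fopen("/sys/module/kvm/parameters/halt_poll_ns", "w");

	if (!f)
		return -1;
	if (fprintf(f, "%u\n", ns) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f);
}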
I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
below.
Agree with this.
kvm.halt_poll_ns serves as a legacy method for controlling halt polling
globally. Once KVM_CAP_HALT_POLL is used for a VM, it should
take full responsibility for controlling halt polling on that VM,
including disabling it. This strategy helps keep the two mechanisms
decoupled.
That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns. If we truly need a mechanism for admins to
disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
separate module parameter for that. But IMO, any setup that is
sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
KVM_CAP_HALT_POLL to disable halt polling.
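For illustration, a minimal userspace sketch (assuming a vm_fd returned by
KVM_CREATE_VM) of disabling halt polling on a single VM through
KVM_CAP_HALT_POLL, i.e. by setting the per-VM maximum to 0:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Disable halt polling for one VM by capping its poll time at 0 ns. */
static int disable_vm_halt_polling(int vm_fd)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_HALT_POLL;
	cap.args[0] = 0;	/* per-VM max halt-polling time in ns */

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}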
If everyone is happy with this approach I can test and send a real patch
to the mailing list.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
struct srcu_struct srcu;
struct srcu_struct irq_srcu;
pid_t userspace_pid;
+ bool override_halt_poll_ns;
unsigned int max_halt_poll_ns;
u32 dirty_ring_size;
bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
goto out_err_no_arch_destroy_vm;
}
- kvm->max_halt_poll_ns = halt_poll_ns;
-
r = kvm_arch_init_vm(kvm, type);
if (r)
goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
sigemptyset(&current->real_blocked);
}
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
{
unsigned int old, val, grow, grow_start;
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
if (val < grow_start)
val = grow_start;
- if (val > vcpu->kvm->max_halt_poll_ns)
- val = vcpu->kvm->max_halt_poll_ns;
+ if (val > max)
+ val = max;
vcpu->halt_poll_ns = val;
out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
{
bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+ unsigned int max_halt_poll_ns;
ktime_t start, cur, poll_end;
+ struct kvm *kvm = vcpu->kvm;
bool waited = false;
u64 halt_ns;
+ if (kvm->override_halt_poll_ns)
+ max_halt_poll_ns = kvm->max_halt_poll_ns;
+ else
+ max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
start = cur = poll_end = ktime_get();
if (do_halt_poll) {
ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
if (halt_poll_allowed) {
if (!vcpu_valid_wakeup(vcpu)) {
shrink_halt_poll_ns(vcpu);
- } else if (vcpu->kvm->max_halt_poll_ns) {
+ } else if (max_halt_poll_ns) {
if (halt_ns <= vcpu->halt_poll_ns)
;
/* we had a long block, shrink polling */
- else if (vcpu->halt_poll_ns &&
- halt_ns > vcpu->kvm->max_halt_poll_ns)
+ else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
shrink_halt_poll_ns(vcpu);
/* we had a short halt and our poll time is too small */
- else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
- halt_ns < vcpu->kvm->max_halt_poll_ns)
- grow_halt_poll_ns(vcpu);
+ else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+ halt_ns < max_halt_poll_ns)
+ grow_halt_poll_ns(vcpu, max_halt_poll_ns);
} else {
vcpu->halt_poll_ns = 0;
}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
return -EINVAL;
+ kvm->override_halt_poll_ns = true;
kvm->max_halt_poll_ns = cap->args[0];
return 0;
}
Looks sensible to me overall.
I will look at the RFC series, thanks for your quick response.
Yanan
On 2021/9/28 1:33, Sean Christopherson wrote:
On Mon, Sep 27, 2021, Paolo Bonzini wrote:
On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
<borntraeger@xxxxxxxxxx> wrote:
So I think there are two possibilities that make sense:
* track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
what about using halt_poll_ns for those VMs that did not use KVM_CAP_HALT_POLL and the private value for those that did?
Yes, that's what I meant. David pointed out that doesn't allow you to
disable halt polling altogether, but for that you can always ask each
VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
don't know about Google's usecase, but mine was actually more about
using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
I kinda like the idea of special-casing halt_poll_ns=0, e.g. for testing or
in-the-field mitigation if halt-polling is broken. It'd be trivial to support, e.g.
Do we have any plan to repost the diff as a fix?
It would be very nice if this issue could be solved.
Besides, I think we may need some documentation for users describing
how halt_poll_ns interacts with KVM_CAP_HALT_POLL, like
"Documentation/virt/guest-halt-polling.rst".
@@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
update_halt_poll_stats(vcpu, start, poll_end, !waited);
if (halt_poll_allowed) {
+ max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
+ if (!max_halt_poll_ns || !halt_poll_ns) <------ squish the max if halt_poll_ns==0
+ max_halt_poll_ns = halt_poll_ns;
+
Does this mean that KVM_CAP_HALT_POLL will not be able to
disable halt polling for a VM individually when halt_poll_ns !=0?
if (!vcpu_valid_wakeup(vcpu)) {
shrink_halt_poll_ns(vcpu);
- } else if (vcpu->kvm->max_halt_poll_ns) {
+ } else if (max_halt_poll_ns) {
if (halt_ns <= vcpu->halt_poll_ns)
;
/* we had a long block, shrink polling */
else if (vcpu->halt_poll_ns &&
- halt_ns > vcpu->kvm->max_halt_poll_ns)
+ halt_ns > max_halt_poll_ns)
shrink_halt_poll_ns(vcpu);
/* we had a short halt and our poll time is too small */
- else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
- halt_ns < vcpu->kvm->max_halt_poll_ns)
- grow_halt_poll_ns(vcpu);
+ else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+ halt_ns < max_halt_poll_ns)
+ grow_halt_poll_ns(vcpu, max_halt_poll_ns);
} else {
vcpu->halt_poll_ns = 0;
}
Thanks,
Yanan