Hi David,
On 2022/11/17 8:16, David Matlack wrote:
Obey kvm.halt_poll_ns in VMs not using KVM_CAP_HALT_POLL on every halt,
rather than just sampling the module parameter when the VM is first
s/first/firstly
created. This restore the original behavior of kvm.halt_poll_ns for VMs
s/restore/restores
that have not opted into KVM_CAP_HALT_POLL.
Notably, this change restores the ability for admins to disable or
change the maximum halt-polling time system wide for VMs not using
KVM_CAP_HALT_POLL.
Should we add more detailed comments about relationship
between KVM_CAP_HALT_POLL and kvm.halt_poll_ns in
Documentation/virt/kvm/api.rst? Something like:
"once KVM_CAP_HALT_POLL is used for a target VM, it will
completely ignores any future changes to kvm.halt_poll_ns..."
Reported-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
Fixes: acd05785e48c ("kvm: add capability for halt polling")
Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
---
include/linux/kvm_host.h | 1 +
virt/kvm/kvm_main.c | 27 ++++++++++++++++++++++++---
2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
struct srcu_struct srcu;
struct srcu_struct irq_srcu;
pid_t userspace_pid;
+ bool override_halt_poll_ns;
unsigned int max_halt_poll_ns;
u32 dirty_ring_size;
bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 78caf19608eb..7f73ce99bd0e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
goto out_err_no_arch_destroy_vm;
}
- kvm->max_halt_poll_ns = halt_poll_ns;
-
r = kvm_arch_init_vm(kvm, type);
if (r)
goto out_err_no_arch_destroy_vm;
@@ -3490,7 +3488,20 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
static unsigned int kvm_vcpu_max_halt_poll_ns(struct kvm_vcpu *vcpu)
{
- return READ_ONCE(vcpu->kvm->max_halt_poll_ns);
+ struct kvm *kvm = vcpu->kvm;
+
+ if (kvm->override_halt_poll_ns) {
+ /*
+ * Ensure kvm->max_halt_poll_ns is not read before
+ * kvm->override_halt_poll_ns.
+ *
+ * Pairs with the smp_wmb() when enabling KVM_CAP_HALT_POLL.
+ */
+ smp_rmb();
+ return READ_ONCE(kvm->max_halt_poll_ns);
+ }
+
+ return READ_ONCE(halt_poll_ns);
}
/*
@@ -4600,6 +4611,16 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
return -EINVAL;
kvm->max_halt_poll_ns = cap->args[0];
+
+ /*
+ * Ensure kvm->override_halt_poll_ns does not become visible
+ * before kvm->max_halt_poll_ns.
+ *
+ * Pairs with the smp_rmb() in kvm_vcpu_max_halt_poll_ns().
+ */
+ smp_wmb();
+ kvm->override_halt_poll_ns = true;
+
return 0;
}
case KVM_CAP_DIRTY_LOG_RING:
Looks good to me:
Reviewed-by: Yanan Wang <wangyanan55@xxxxxxxxxx>
Thanks,
Yanan