On 31/07/19 13:39, Wanpeng Li wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ed061d8..12f2c91 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  			continue;
>  		if (vcpu == me)
>  			continue;
> -		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> +		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
>  			continue;
>  		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
>  		    !kvm_arch_vcpu_in_kernel(vcpu))

This cannot work: swait_active means the vCPU is waiting, so it cannot
be involuntarily preempted.  The problem here is simply that
kvm_vcpu_has_events is being called without holding the lock.  So
kvm_arch_vcpu_runnable itself is okay; it's the implementation that is
wrong.

Just rename the existing function to vcpu_runnable and add a new arch
callback, kvm_arch_dy_runnable.  kvm_arch_dy_runnable can be
conservative and return true only for a subset of events; in
particular, for x86 it can check:

- vcpu->arch.pv.pv_unhalted

- KVM_REQ_NMI or KVM_REQ_SMI or KVM_REQ_EVENT

- PIR.ON, if APICv is enabled

Ultimately, all variables accessed in kvm_arch_dy_runnable should be
read with READ_ONCE or atomic_read.  And, for all architectures,
kvm_vcpu_on_spin should check
list_empty_careful(&vcpu->async_pf.done).  A rough sketch of both the
x86 callback and the generic side is appended below.

It's okay if your patch renames the function in non-x86 architectures
and leaves the fix to the maintainers.  So, let's CC Marc and
Christian, since ARM and s390 have pretty complex
kvm_arch_vcpu_runnable implementations as well.

Paolo
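
P.S. For concreteness, here is a rough, untested sketch of what the
x86 callback could look like.  kvm_test_request and the vcpu->arch
fields are what x86 KVM has today; dy_apicv_has_pending_interrupt is a
made-up name for a new vendor hook that would test PIR.ON locklessly:

bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu)
{
	/* Paravirt "kick" flag, set from another vCPU's context. */
	if (READ_ONCE(vcpu->arch.pv.pv_unhalted))
		return true;

	/* kvm_test_request reads vcpu->requests without clearing it. */
	if (kvm_test_request(KVM_REQ_NMI, vcpu) ||
	    kvm_test_request(KVM_REQ_SMI, vcpu) ||
	    kvm_test_request(KVM_REQ_EVENT, vcpu))
		return true;

	/*
	 * With APICv, a pending posted interrupt is signaled by PIR.ON;
	 * the (hypothetical) vendor hook checks that bit locklessly.
	 */
	if (vcpu->arch.apicv_active &&
	    kvm_x86_ops->dy_apicv_has_pending_interrupt(vcpu))
		return true;

	return false;
}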
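
On the generic side, the spin loop would then use something along
these lines (again untested; vcpu_dy_runnable is just an assumed name
for the wrapper):

static bool vcpu_dy_runnable(struct kvm_vcpu *vcpu)
{
	if (kvm_arch_dy_runnable(vcpu))
		return true;

#ifdef CONFIG_KVM_ASYNC_PF
	/* list_empty_careful is safe without taking async_pf.lock. */
	if (!list_empty_careful(&vcpu->async_pf.done))
		return true;
#endif

	return false;
}

and in kvm_vcpu_on_spin:

		if (swait_active(&vcpu->wq) && !vcpu_dy_runnable(vcpu))
			continue;

A __weak default for kvm_arch_dy_runnable that just returns false
would let architectures opt in one by one.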