On Wed, 31 Jul 2019 at 20:55, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 31/07/19 13:39, Wanpeng Li wrote:
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index ed061d8..12f2c91 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >  			continue;
> >  		if (vcpu == me)
> >  			continue;
> > -		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> > +		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
> >  			continue;
> >  		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> >  			!kvm_arch_vcpu_in_kernel(vcpu))
> >
>
> This cannot work.  swait_active means you are waiting, so you cannot be
> involuntarily preempted.
>
> The problem here is simply that kvm_vcpu_has_events is being called
> without holding the lock.  So kvm_arch_vcpu_runnable is okay, it's the
> implementation that's wrong.
>
> Just rename the existing function to just vcpu_runnable and make a new
> arch callback kvm_arch_dy_runnable.  kvm_arch_dy_runnable can be
> conservative and only return true for a subset of events, in particular
> for x86 it can check:
>
> - vcpu->arch.pv.pv_unhalted
>
> - KVM_REQ_NMI or KVM_REQ_SMI or KVM_REQ_EVENT
>
> - PIR.ON if APICv is set
>
> Ultimately, all variables accessed in kvm_arch_dy_runnable should be
> accessed with READ_ONCE or atomic_read.
>
> And for all architectures, kvm_vcpu_on_spin should check
> list_empty_careful(&vcpu->async_pf.done)
>
> It's okay if your patch renames the function in non-x86 architectures,
> leaving the fix to maintainers.  So, let's CC Marc and Christian since
> ARM and s390 have pretty complex kvm_arch_vcpu_runnable as well.

Ok, just sent a patch to do this.

Regards,
Wanpeng Li
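
[Editor's sketch: a minimal illustration of what the x86 kvm_arch_dy_runnable() callback suggested above could look like, covering only the checks listed in the mail.  This is not the patch that was sent; the APICv/PIR.ON test is shown through a placeholder helper rather than a real KVM function.]

bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu)
{
	/* A paravirt kick marks the vCPU as unhalted. */
	if (READ_ONCE(vcpu->arch.pv.pv_unhalted))
		return true;

	/* Pending NMI, SMI, or event-injection requests. */
	if (kvm_test_request(KVM_REQ_NMI, vcpu) ||
	    kvm_test_request(KVM_REQ_SMI, vcpu) ||
	    kvm_test_request(KVM_REQ_EVENT, vcpu))
		return true;

	/*
	 * With APICv active, a posted interrupt sets PIR.ON; checking it
	 * needs a vendor-specific hook -- the name below is hypothetical.
	 */
	if (vcpu->arch.apicv_active &&
	    vendor_apicv_has_pending_interrupt(vcpu))	/* hypothetical helper */
		return true;

	return false;
}

Since every field above is read with READ_ONCE() or a request-test helper, the check stays safe to call from kvm_vcpu_on_spin() without taking the vCPU lock, which is the point of splitting it from kvm_arch_vcpu_runnable().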