On 31/07/19 13:39, Wanpeng Li wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ed061d8..12f2c91 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  			continue;
>  		if (vcpu == me)
>  			continue;
> -		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> +		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
>  			continue;
>  		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
>  		    !kvm_arch_vcpu_in_kernel(vcpu))

This cannot work: swait_active means the vCPU is waiting, so it cannot
be involuntarily preempted.  The problem here is simply that
kvm_vcpu_has_events is being called without holding the lock.  So
kvm_arch_vcpu_runnable itself is okay; it's the implementation that is
wrong.

Just rename the existing function to vcpu_runnable and add a new arch
callback, kvm_arch_dy_runnable.  kvm_arch_dy_runnable can be
conservative and return true only for a subset of events; in
particular, for x86 it can check:

- vcpu->arch.pv.pv_unhalted

- KVM_REQ_NMI or KVM_REQ_SMI or KVM_REQ_EVENT

- PIR.ON, if APICv is enabled

Ultimately, all variables accessed in kvm_arch_dy_runnable should be
read with READ_ONCE or atomic_read.  And, for all architectures,
kvm_vcpu_on_spin should check
list_empty_careful(&vcpu->async_pf.done).  A rough sketch of both the
x86 callback and the generic side is appended below.

It's okay if your patch renames the function in non-x86 architectures
and leaves the fix to the maintainers.  So, let's CC Marc and
Christian, since ARM and s390 have pretty complex
kvm_arch_vcpu_runnable implementations as well.

Paolo
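
P.S. For concreteness, here is a rough, untested sketch of what the
x86 callback could look like.  kvm_test_request and the vcpu->arch
fields are what x86 KVM has today; dy_apicv_has_pending_interrupt is a
made-up name for a new vendor hook that would test PIR.ON locklessly:

bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu)
{
	/* Paravirt "kick" flag, set from another vCPU's context. */
	if (READ_ONCE(vcpu->arch.pv.pv_unhalted))
		return true;

	/* kvm_test_request reads vcpu->requests without clearing it. */
	if (kvm_test_request(KVM_REQ_NMI, vcpu) ||
	    kvm_test_request(KVM_REQ_SMI, vcpu) ||
	    kvm_test_request(KVM_REQ_EVENT, vcpu))
		return true;

	/*
	 * With APICv, a pending posted interrupt is signaled by PIR.ON;
	 * the (hypothetical) vendor hook checks that bit locklessly.
	 */
	if (vcpu->arch.apicv_active &&
	    kvm_x86_ops->dy_apicv_has_pending_interrupt(vcpu))
		return true;

	return false;
}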
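
On the generic side, the spin loop would then use something along
these lines (again untested; vcpu_dy_runnable is just an assumed name
for the wrapper):

static bool vcpu_dy_runnable(struct kvm_vcpu *vcpu)
{
	if (kvm_arch_dy_runnable(vcpu))
		return true;

#ifdef CONFIG_KVM_ASYNC_PF
	/* list_empty_careful is safe without taking async_pf.lock. */
	if (!list_empty_careful(&vcpu->async_pf.done))
		return true;
#endif

	return false;
}

and in kvm_vcpu_on_spin:

		if (swait_active(&vcpu->wq) && !vcpu_dy_runnable(vcpu))
			continue;

A __weak default for kvm_arch_dy_runnable that just returns false
would let architectures opt in one by one.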