On Thu, 2021-01-07 at 18:51 +0100, Paolo Bonzini wrote:
> On 07/01/21 18:00, Sean Christopherson wrote:
> > Ugh, I assume this is due to one of the "premature"
> > nested_ops->check_events() calls that are necessitated by the event
> > mess?  I'm guessing kvm_vcpu_running() is the culprit?
> >
> > If my assumption is correct, this bug affects nVMX as well.
>
> Yes, though it may be latent.  For SVM it was, until we started
> allocating svm->nested on demand.
>
> > Rather than clear the request blindly on any nested VM-Exit, what
> > about something like the following?
>
> I think your patch is overkill; KVM_REQ_GET_NESTED_STATE_PAGES is only
> set from KVM_SET_NESTED_STATE, so it cannot happen while the VM runs.

Note that I didn't include the same fix for VMX because it uses a
separate vmcs for the guest, which has its own MSR bitmap, so in theory
this shouldn't be needed, but it won't hurt. I'll test whether
canceling KVM_REQ_GET_NESTED_STATE_PAGES on VMX makes any difference in
regard to the nested migration crashes I am seeing.

Best regards,
	Maxim Levitsky

> Something like this is small enough and works well.
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index a622e63739b4..cb4c6ee10029 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -595,6 +595,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  	svm->nested.vmcb12_gpa = 0;
>  	WARN_ON_ONCE(svm->nested.nested_run_pending);
>  
> +	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
> +
>  	/* in case we halted in L2 */
>  	svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e2f26564a12d..0fbb46990dfc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4442,6 +4442,8 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>  	/* trying to cancel vmlaunch/vmresume is a bug */
>  	WARN_ON_ONCE(vmx->nested.nested_run_pending);
>  
> +	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> +
>  	/* Service the TLB flush request for L2 before switching to L1. */
>  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
>  		kvm_vcpu_flush_tlb_current(vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3f7c1fc7a3ce..b7e784b5489c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8789,7 +8789,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  	if (kvm_request_pending(vcpu)) {
>  		if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> -			if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
> +			if (WARN_ON_ONCE(!is_guest_mode(vcpu)))
> +				;
> +			else if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
>  				r = 0;
>  				goto out;
>  			}