On Thu, 2021-01-07 at 04:38 +0200, Maxim Levitsky wrote:
> On Wed, 2021-01-06 at 10:17 -0800, Sean Christopherson wrote:
> > On Wed, Jan 06, 2021, Maxim Levitsky wrote:
> > > If migration happens while L2 entry with an injected event to L2 is pending,
> > > we weren't including the event in the migration state and it would be
> > > lost leading to L2 hang.
> >
> > But the injected event should still be in vmcs12 and KVM_STATE_NESTED_RUN_PENDING
> > should be set in the migration state, i.e. it should naturally be copied to
> > vmcs02 and thus (re)injected by vmx_set_nested_state(). Is nested_run_pending
> > not set? Is the info in vmcs12 somehow lost? Or am I off in left field...
>
> You are completely right.
> The injected event can be copied like that since the vmc(b|s)12 is migrated.
>
> We can safely disregard both these two patches and the parallel two patches for SVM.
> I am almost sure that the real root cause of this bug was that we
> weren't restoring the nested run pending flag, and I even
> happened to fix this in this patch series.
>
> This is the trace of the bug (I removed the timestamps to make it easier to read)
>
>
> kvm_exit: vcpu 0 reason vmrun rip 0xffffffffa0688ffa info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
> kvm_nested_vmrun: rip: 0xffffffffa0688ffa vmcb: 0x0000000103594000 nrip: 0xffffffff814b3b01 int_ctl: 0x01000001 event_inj: 0x80000036 npt: on
> ^^^ this is the injection
> kvm_nested_intercepts: cr_read: 0010 cr_write: 0010 excp: 00060042 intercepts: bc4c8027 00006e7f 00000000
> kvm_fpu: unload
> kvm_userspace_exit: reason KVM_EXIT_INTR (10)
>
> ============================================================================
> migration happens here
> ============================================================================
>
> ...
> kvm_async_pf_ready: token 0xffffffff gva 0
> kvm_apic_accept_irq: apicid 0 vec 243 (Fixed|edge)
>
> kvm_nested_intr_vmexit: rip: 0x000000000000fff0
>
> ^^^^^ this is the nested vmexit that shouldn't have happened, since nested run is pending,
> and which erased the eventinj field which was migrated correctly just like you say.
>
> kvm_nested_vmexit_inject: reason: interrupt ext_inf1: 0x0000000000000000 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
> ...
>
>
> We did notice that this vmexit had a weird RIP and I
> even explained this later to myself,
> that this is the default RIP which we put to vmcb,
> and it wasn't yet updated, since it updates just prior to vm entry.
>
> My test already survived about 170 iterations (usually it crashes after 20-40 iterations)
> I am leaving the stress test running all night, let's see if it survives.

And after leaving it overnight, the test survived about 1000 iterations.

Thanks again!

Best regards,
	Maxim Levitsky

>
> V2 of the patches is on the way.
>
> Thanks again for the help!
>
> Best regards,
> 	Maxim Levitsky
>
> > >
> > > Fix this by queueing the injected event in similar manner to how we queue
> > > interrupted injections.
> > >
> > > This can be reproduced by running an IO intense task in L2,
> > > and repeatedly migrating the L1.
> > >
> > > Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > > ---
> > >  arch/x86/kvm/vmx/nested.c | 12 ++++++------
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index e2f26564a12de..2ea0bb14f385f 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -2355,12 +2355,12 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
> > >           * Interrupt/Exception Fields
> > >           */
> > >          if (vmx->nested.nested_run_pending) {
> > > -                vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
> > > -                             vmcs12->vm_entry_intr_info_field);
> > > -                vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
> > > -                             vmcs12->vm_entry_exception_error_code);
> > > -                vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
> > > -                             vmcs12->vm_entry_instruction_len);
> > > +                if ((vmcs12->vm_entry_intr_info_field & VECTORING_INFO_VALID_MASK))
> > > +                        vmx_process_injected_event(&vmx->vcpu,
> > > +                                                   vmcs12->vm_entry_intr_info_field,
> > > +                                                   vmcs12->vm_entry_instruction_len,
> > > +                                                   vmcs12->vm_entry_exception_error_code);
> > > +
> > >                  vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
> > >                               vmcs12->guest_interruptibility_info);
> > >                  vmx->loaded_vmcs->nmi_known_unmasked =
> > > --
> > > 2.26.2
> > >
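
For anyone reading this thread later, the failure mode described at the top
(injected event migrated correctly, nested run pending flag not restored, an
interrupt then forcing a nested vmexit that erases the injection) can be
sketched as a small userspace toy model. This is only an illustration under
simplifying assumptions and is not KVM code: every name below (struct
toy_nested_state, toy_restore, toy_deliver_l1_interrupt, toy_enter_l2,
toy_simulate_migration) is hypothetical, and the only value taken from the
trace is event_inj 0x80000036.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of the nested state; not KVM's real layout. */
struct toy_nested_state {
	bool     guest_mode;         /* vCPU is set up to run L2 */
	bool     nested_run_pending; /* L1's VMRUN was emulated, L2 not entered yet */
	uint32_t event_inj;          /* event L1 asked to inject into L2, 0 = none */
};

/*
 * Restore side of migration. The vmc(b|s)12 contents (here: event_inj)
 * always arrive intact; restore_run_pending == false models the bug
 * where the nested run pending flag was not restored.
 */
static struct toy_nested_state toy_restore(const struct toy_nested_state *saved,
					   bool restore_run_pending)
{
	struct toy_nested_state dst = *saved;

	if (!restore_run_pending)
		dst.nested_run_pending = false;
	return dst;
}

/*
 * An interrupt for L1 arrives while the vCPU is in guest mode.
 * Returns true if a nested vmexit was performed.
 */
static bool toy_deliver_l1_interrupt(struct toy_nested_state *s)
{
	if (!s->guest_mode)
		return false;
	if (s->nested_run_pending)
		return false;	/* the pending L2 entry must complete first */

	/* Nested vmexit back to L1: the pending injection is discarded. */
	s->guest_mode = false;
	s->event_inj = 0;
	return true;
}

/* Complete the pending L2 entry; returns the event actually injected. */
static uint32_t toy_enter_l2(struct toy_nested_state *s)
{
	uint32_t injected = 0;

	if (s->guest_mode && s->nested_run_pending) {
		injected = s->event_inj;
		s->event_inj = 0;
		s->nested_run_pending = false;
	}
	return injected;
}

/* Replays the sequence from the trace; returns the event that reaches L2. */
uint32_t toy_simulate_migration(bool restore_run_pending)
{
	struct toy_nested_state src = {
		.guest_mode = true,
		.nested_run_pending = true,
		.event_inj = 0x80000036u,	/* value from the trace */
	};
	struct toy_nested_state dst = toy_restore(&src, restore_run_pending);

	/* The injected event itself survives migration in both cases. */
	assert(dst.event_inj == src.event_inj);

	(void)toy_deliver_l1_interrupt(&dst);	/* the vec 243 interrupt */
	return toy_enter_l2(&dst);
}
```

In this model toy_simulate_migration(true) returns 0x80000036, i.e. the event
reaches L2, while toy_simulate_migration(false) returns 0 because the
interrupt triggers a nested vmexit before the entry completes, which matches
the lost injection seen in the trace.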