> On 30 Aug 2018, at 14:50, Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
>
> 2018-08-30 12:57+0300, Liran Alon:
>> Consider the case where L1 has a pending IRQ/NMI event when it
>> executes VMLAUNCH/VMRESUME, and the event wasn't delivered because
>> it was disallowed (e.g. interrupts disabled). When L1 executes
>> VMLAUNCH/VMRESUME, L0 needs to evaluate whether this pending event
>> should cause an exit from L2 to L1 or be delivered directly to L2
>> (e.g. in case L1 doesn't intercept EXTERNAL_INTERRUPT).
>>
>> Usually this would be handled by L0 requesting an IRQ/NMI window
>> by setting the VMCS accordingly. However, this setting was done on
>> VMCS01, and now VMCS02 is active instead. Thus, when L1 executes
>> VMLAUNCH/VMRESUME, we force L0 to perform pending event evaluation
>> by requesting a KVM_REQ_EVENT.
>>
>> Note that the above scenario exists when L1 KVM is about to enter
>> L2 but requests an "immediate-exit", as in this case L1 will
>> disable interrupts and then send a self-IPI before entering L2.
>>
>> Co-authored-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> Signed-off-by: Liran Alon <liran.alon@xxxxxxxxxx>
>> ---
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> @@ -12574,6 +12577,25 @@ static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
>>  		kvm_make_request(KVM_REQ_GET_VMCS12_PAGES, vcpu);
>>  	}
>>
>> +	/*
>> +	 * If L1 had a pending IRQ/NMI until it executed
>> +	 * VMLAUNCH/VMRESUME which wasn't delivered because it was
>> +	 * disallowed (e.g. interrupts disabled), L0 needs to
>> +	 * evaluate whether this pending event should cause an exit
>> +	 * from L2 to L1 or be delivered directly to L2 (e.g. in case
>> +	 * L1 doesn't intercept EXTERNAL_INTERRUPT).
>> +	 *
>> +	 * Usually this would be handled by L0 requesting an
>> +	 * IRQ/NMI window by setting the VMCS accordingly. However,
>> +	 * this setting was done on VMCS01 and now VMCS02 is active
>> +	 * instead. Thus, we force L0 to perform pending event
>> +	 * evaluation by requesting a KVM_REQ_EVENT.
>> +	 */
>> +	if (vmcs01_cpu_exec_ctrl &
>> +	    (CPU_BASED_VIRTUAL_INTR_PENDING | CPU_BASED_VIRTUAL_NMI_PENDING)) {
>
> Looks good, pending nested interrupts will be handled on the actual VM
> entry, so we can ignore them here.

Can you clarify?

>> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +	}
>> +
>>  	/*
>>  	 * Note no nested_vmx_succeed or nested_vmx_fail here. At this point
>>  	 * we are no longer running L1, and VMLAUNCH/VMRESUME has not yet
>> @@ -12702,7 +12724,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>>  	 * by event injection, halt vcpu.
>>  	 */
>>  	if ((vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT) &&
>> -	    !(vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK)) {
>> +	    !(vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK) &&
>> +	    !kvm_test_request(KVM_REQ_EVENT, vcpu)) {
>
> What is the purpose of this check?  I think the event is recognized
> when checking for runnability and will resume the VCPU,

That's actually not accurate.

Without this change, kvm_vcpu_halt() will set mp_state = KVM_MP_STATE_HALTED.
Later, vcpu_run() will call kvm_vcpu_running(), which will return false.
(Note that vmx_check_nested_events() won't exit to L1 because
nested_run_pending is set.) Therefore, vcpu_block() will be called, which
will call kvm_vcpu_check_block() -> kvm_arch_vcpu_runnable() ->
kvm_vcpu_has_events(), which does check for pending interrupts using
kvm_cpu_has_interrupt().
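To make the failure mode concrete, here is a tiny user-space model of that
chain (toy code, not the kernel sources; the functions only mirror their
KVM namesakes, and the interrupt_allowed() part is the point I explain
right below):

#include <stdbool.h>
#include <stdio.h>

struct vcpu_model {
	bool halted;             /* mp_state == KVM_MP_STATE_HALTED */
	bool irq_pending;        /* L1 has a pending external interrupt */
	bool nested_run_pending; /* VMLAUNCH/VMRESUME not completed yet */
};

/* Mirrors vmx_interrupt_allowed(): injection is blocked while
 * nested_run_pending is set (the other conditions are omitted). */
static bool interrupt_allowed(const struct vcpu_model *v)
{
	return !v->nested_run_pending;
}

/* Mirrors kvm_vcpu_has_events(): a pending IRQ only counts as a
 * wakeup event if it is currently injectable. */
static bool vcpu_has_events(const struct vcpu_model *v)
{
	return v->irq_pending && interrupt_allowed(v);
}

/* Mirrors kvm_arch_vcpu_runnable() as seen from kvm_vcpu_check_block(). */
static bool vcpu_runnable(const struct vcpu_model *v)
{
	return !v->halted || vcpu_has_events(v);
}

int main(void)
{
	struct vcpu_model v = {
		.halted = true,             /* GUEST_ACTIVITY_HLT path taken */
		.irq_pending = true,        /* the IRQ we must not lose */
		.nested_run_pending = true,
	};

	/* Prints "runnable: 0": the halted vCPU is not considered
	 * runnable even though an IRQ is pending, so vcpu_block()
	 * sleeps forever.  The extra !kvm_test_request(KVM_REQ_EVENT)
	 * check avoids taking the halt path in the first place. */
	printf("runnable: %d\n", vcpu_runnable(&v));
	return 0;
}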
However, because nested_run_pending is set, vmx_interrupt_allowed() returns
false, and therefore kvm_vcpu_has_events() doesn't return true for the
pending interrupt. The vCPU is never woken up, so I believe this change is
required.

In addition, I have a unit test for this scenario as well; it fails without
this change and passes with it. I will submit it soon (it's now in internal
review).

-Liran

> thanks.

>>  		vmx->nested.nested_run_pending = 0;
>>  		return kvm_vcpu_halt(vcpu);
>>  	}
>> --
>> 2.16.1
>>
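P.S. For anyone puzzled by the "immediate-exit" trick mentioned in the
commit message (self-IPI with interrupts disabled): an event queued while
it is masked is guaranteed to fire the instant it is re-evaluated on
unmask, which is exactly why the pending self-IPI must be re-evaluated
when L0 enters L2. A quick user-space analogy, using POSIX signals instead
of IPIs (again a toy, not KVM code):

#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t fired;

static void handler(int sig)
{
	(void)sig;
	fired = 1;
}

int main(void)
{
	sigset_t set;

	signal(SIGUSR1, handler);
	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);

	sigprocmask(SIG_BLOCK, &set, NULL);   /* ~ local_irq_disable() */
	raise(SIGUSR1);                       /* ~ self-IPI: stays pending */
	printf("masked:   fired=%d\n", (int)fired);  /* 0 - not delivered */

	/* ~ entering L2 and re-evaluating pending events */
	sigprocmask(SIG_UNBLOCK, &set, NULL);
	printf("unmasked: fired=%d\n", (int)fired);  /* 1 - fires at once */
	return 0;
}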