On Thu, Feb 08, 2018 at 04:09:36AM -0800, Liran Alon wrote:
>
>----- pbonzini@xxxxxxxxxx wrote:
>
>> On 08/02/2018 06:13, Chao Gao wrote:
>> > Although L2 is in halt state, it will be in the active state after
>> > VM entry if the VM entry is vectoring. Halting the vcpu here means
>> > the event won't be injected to L2 and this decision isn't reported
>> > to L1. Thus L0 drops an event that should be injected to L2.
>> >
>> > Because virtual interrupt delivery may wake L2 vcpu, if VID is enabled,
>> > do the same thing -- don't halt L2.
>>
>> This second part seems wrong to me, or at least overly general. Perhaps
>> you mean if RVI>0?
>>
>> Thanks,
>>
>> Paolo
>
>I would first recommend splitting this commit.
>The first commit should handle only the case of a vectoring VM entry.
>It should also state in the commit message that it is based on Intel SDM 26.6.2 Activity State:
>("If the VM entry is vectoring, the logical processor is in the active state after VM entry.")
>That part of the code seems correct to me.
>
>The second commit seems wrong to me as well.
>(I would also mention here that it is based on Intel SDM 26.6.5
>Interrupt-Window Exiting and Virtual-Interrupt Delivery:
>"These events wake the logical processor if it just entered the HLT state because of a VM entry")
>
>Paolo, I think that your suggestion is not sufficient either.
>Consider the case where the APIC's TPR blocks the interrupt specified in RVI.
>
>I think that we should just remove the check for VID from this commit,
>and instead fix another bug, which I describe below.
>
>If lapic_in_kernel()==false, then APICv is not active and VID is not exposed to L1
>(in current KVM code; Jim has already criticized that this is wrong).
>
>Otherwise, kvm_vcpu_halt() will change mp_state to KVM_MP_STATE_HALTED.
>Eventually, vcpu_run() will call vcpu_block(), which will reach kvm_vcpu_has_events().
>That function is responsible for checking whether there are any pending interrupts,
>including pending interrupts that result from VID being enabled with RVI>0

The difference between checking VID and RVI here and in vcpu_run() is that
"nested_run_pending" is set for the former. Would it cause any problem if we
don't set it?

Thanks
Chao

>(while also taking into account the APIC's TPR).
>The logic that checks for pending interrupts is kvm_cpu_has_interrupt(),
>which eventually reaches apic_has_interrupt_for_ppr().
>If APICv is enabled, apic_has_interrupt_for_ppr() will call vmx_sync_pir_to_irr(),
>which calls vmx_hwapic_irr_update().
>
>However, the max_irr returned to apic_has_interrupt_for_ppr() does not consider the interrupt
>pending in RVI, which I think is the real bug to fix here.
>In the non-nested case, RVI can never be larger than max_irr, because that is how L0 KVM manages RVI.
>However, in the nested case, L1 can set RVI in the VMCS arbitrarily
>(we just copy GUEST_INTR_STATUS from vmcs12 into vmcs02).
>
>A possible patch to fix this is to change vmx_hwapic_irr_update() such that,
>if is_guest_mode(vcpu)==true, it returns max(max_irr, rvi), and to feed
>that value back to apic_has_interrupt_for_ppr().
>I need to verify that it doesn't break other flows, but I think it makes sense.
>What do you think?
>
>Regards,
>-Liran
>
>>
>> > Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> > ---
>> >  arch/x86/kvm/vmx.c | 10 ++++++++--
>> >  1 file changed, 8 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> > index bb5b488..e1fe4e4 100644
>> > --- a/arch/x86/kvm/vmx.c
>> > +++ b/arch/x86/kvm/vmx.c
>> > @@ -10985,8 +10985,14 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>> >  	if (ret)
>> >  		return ret;
>> >
>> > -	if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT)
>> > -		return kvm_vcpu_halt(vcpu);
>> > +	if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT) {
>> > +		u32 intr_info = vmcs_read32(VM_ENTRY_INTR_INFO_FIELD);
>> > +		u32 exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
>> > +
>> > +		if (!(intr_info & VECTORING_INFO_VALID_MASK) &&
>> > +		    !(exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
>> > +			return kvm_vcpu_halt(vcpu);
>> > +	}
>> >
>> >  	vmx->nested.nested_run_pending = 1;
>> >