On 08/02/2018 13:09, Liran Alon wrote:
> ----- pbonzini@xxxxxxxxxx wrote:
>> On 08/02/2018 06:13, Chao Gao wrote:
>>> Because virtual interrupt delivery may wake L2 vcpu, if VID is
>>> enabled, do the same thing -- don't halt L2.
>>
>> This second part seems wrong to me, or at least overly general.
>> Perhaps you mean if RVI>0?
>
> I would first recommend to split this commit.
> The first commit should handle only the case of vectoring VM entry.
> It should also specify in commit message it is based on Intel SDM 26.6.2 Activity State:
> ("If the VM entry is vectoring, the logical processor is in the active state after VM entry.")
> That part in code seems correct to me.

I agree.

> The second commit seems wrong to me as-well.
> (I would also mention here it is based on Intel SDM 26.6.5
> Interrupt-Window Exiting and Virtual-Interrupt Delivery:
> "These events wake the logical processor if it just entered the HLT state because of a VM entry")
>
> Paolo, I think that your suggestion is not sufficient as well.
> Consider the case that APIC's TPR blocks interrupt specified in RVI.

That's true. It should be RVI>PPR.

> Otherwise, kvm_vcpu_halt() will change mp_state to KVM_MP_STATE_HALTED.
> Eventually, vcpu_run() will call vcpu_block() which will reach kvm_vcpu_has_events().
> That function is responsible for checking if there is any pending interrupts.
> Including, pending interrupts as a result of VID enabled and RVI>0
> (While also taking into account the APIC's TPR).
> The logic that checks for pending interrupts is kvm_cpu_has_interrupt()
> which eventually reach apic_has_interrupt_for_ppr().
> If APICv is enabled, apic_has_interrupt_for_ppr() will call vmx_sync_pir_to_irr()
> which calls vmx_hwapic_irr_update().
>
> However, max_irr returned to apic_has_interrupt_for_ppr() does not consider the interrupt
> pending in RVI. Which I think is the real bug to fix here.
> In the non-nested case, RVI can never be larger than max_irr because that is how L0 KVM manages RVI.
> However, in the nested case, L1 can set RVI in VMCS arbitrary
> (we just copy GUEST_INTR_STATUS from vmcs01 into vmcs02).
>
> A possible patch to fix this is to change vmx_hwapic_irr_update() such that
> if is_guest_mode(vcpu)==true, we should return max(max_irr, rvi) and return
> that value into apic_has_interrupt_for_ppr().
> Need to verify that it doesn't break other flows but I think it makes sense.
> What do you think?

Yeah, I think it makes sense though I'd need to look a lot more at
arch/x86/kvm/lapic.c and arch/x86/kvm/vmx.c to turn that into a patch!

Paolo
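
For reference, the two conditions discussed above (RVI>PPR for the wake-up
check, and max(max_irr, rvi) while in guest mode) could be sketched roughly
as below. This is only a standalone illustration, not KVM code: all helper
names are hypothetical, RVI is assumed to be the low byte of
GUEST_INTR_STATUS, max_irr is assumed to be -1 when nothing is pending, and
the comparison against PPR is assumed to use the priority class (bits 7:4).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: assumes RVI is the low byte of the 16-bit
 * guest interrupt status field (GUEST_INTR_STATUS). */
static inline uint8_t rvi_from_intr_status(uint16_t guest_intr_status)
{
	return guest_intr_status & 0xff;
}

/* Hypothetical helper modelling "return max(max_irr, rvi)" for the
 * nested case only. max_irr is assumed to be -1 when IRR is empty. */
static inline int effective_max_irr(int max_irr, uint8_t rvi, bool guest_mode)
{
	if (guest_mode && (int)rvi > max_irr)
		return rvi;
	return max_irr;
}

/* Hypothetical helper: a vector can wake the vcpu only if its priority
 * class (bits 7:4) is above the priority class of PPR. */
static inline bool vector_beats_ppr(int vector, uint8_t ppr)
{
	return vector >= 0 && (vector & 0xf0) > (ppr & 0xf0);
}
```

With these, a vector 0x51 in RVI and an empty IRR gives an effective
max_irr of 0x51 in guest mode, which beats a PPR of 0x40 but not a PPR of
0x50, i.e. the TPR-blocked case mentioned above would still halt.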