Re: [PATCH] x86/kvm/vmx: Don't halt vcpu when L1 is injecting events to L2

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Thu, 8 Feb 2018 14:53:53 +0100

On 08/02/2018 13:09, Liran Alon wrote:
> ----- pbonzini@xxxxxxxxxx wrote:
>> On 08/02/2018 06:13, Chao Gao wrote:
>>> Because virtual interrupt delivery may wake L2 vcpu, if VID is
>>> enabled, do the same thing -- don't halt L2.
>>
>> This second part seems wrong to me, or at least overly general. 
>> Perhaps you mean if RVI>0?
> 
> I would first recommend to split this commit.
> The first commit should handle only the case of vectoring VM entry.
> It should also specify in commit message it is based on Intel SDM 26.6.2 Activity State:
> ("If the VM entry is vectoring, the logical processor is in the active state after VM entry.")
> That part in code seems correct to me.

I agree.

> The second commit seems wrong to me as-well.
> (I would also mention here it is based on Intel SDM 26.6.5
> Interrupt-Window Exiting and Virtual-Interrupt Delivery:
> "These events wake the logical processor if it just entered the HLT state because of a VM entry")
> 
> Paolo, I think that your suggestion is not sufficient as well.
> Consider the case that APIC's TPR blocks interrupt specified in RVI.

That's true.  It should be RVI>PPR.

> Otherwise, kvm_vcpu_halt() will change mp_state to KVM_MP_STATE_HALTED.
> Eventually, vcpu_run() will call vcpu_block() which will reach kvm_vcpu_has_events().
> That function is responsible for checking if there is any pending interrupts.
> Including, pending interrupts as a result of VID enabled and RVI>0
> (While also taking into account the APIC's TPR).
> The logic that checks for pending interrupts is kvm_cpu_has_interrupt()
> which eventually reach apic_has_interrupt_for_ppr().
> If APICv is enabled, apic_has_interrupt_for_ppr() will call vmx_sync_pir_to_irr()
> which calls vmx_hwapic_irr_update().
> 
> However, max_irr returned to apic_has_interrupt_for_ppr() does not consider the interrupt
> pending in RVI. Which I think is the real bug to fix here.
> In the non-nested case, RVI can never be larger than max_irr because that is how L0 KVM manages RVI.
> However, in the nested case, L1 can set RVI in VMCS arbitrary
> (we just copy GUEST_INTR_STATUS from vmcs01 into vmcs02).
> 
> A possible patch to fix this is to change vmx_hwapic_irr_update() such that
> if is_guest_mode(vcpu)==true, we should return max(max_irr, rvi) and return
> that value into apic_has_interrupt_for_ppr().
> Need to verify that it doesn't break other flows but I think it makes sense.
> What do you think?

Yeah, I think it makes sense though I'd need to look a lot more at
arch/x86/kvm/lapic.c and arch/x86/kvm/vmx.c to turn that into a patch!

Paolo