On Wed, Oct 02, 2024, Sean Christopherson wrote:
> On Wed, Oct 02, 2024, Markku Ahvenjärvi wrote:
> > Hi Sean,
> >
> > > On Fri, Sep 20, 2024, Markku Ahvenjärvi wrote:
> > > > Running certain hypervisors under KVM on VMX suffered L1 hangs after
> > > > launching a nested guest. The external interrupts were not processed on
> > > > vmlaunch/vmresume due to stale VPPR, and L2 guest would resume without
> > > > allowing L1 hypervisor to process the events.
> > > >
> > > > The patch ensures VPPR to be updated when checking for pending
> > > > interrupts.
> > >
> > > This is architecturally incorrect, PPR isn't refreshed at VM-Enter.
> >
> > I looked into this and found the following from Intel manual:
> >
> > "30.1.3 PPR Virtualization
> >
> > The processor performs PPR virtualization in response to the following
> > operations: (1) VM entry; (2) TPR virtualization; and (3) EOI virtualization.
> >
> > ..."
> >
> > The section "27.3.2.5 Updating Non-Register State" further explains the VM
> > enter:
> >
> > "If the “virtual-interrupt delivery” VM-execution control is 1, VM entry loads
> > the values of RVI and SVI from the guest interrupt-status field in the VMCS
> > (see Section 25.4.2). After doing so, the logical processor first causes PPR
> > virtualization (Section 30.1.3) and then evaluates pending virtual interrupts
> > (Section 30.2.1). If a virtual interrupt is recognized, it may be delivered in
> > VMX non-root operation immediately after VM entry (including any specified
> > event injection) completes; ..."
> >
> > According to that, PPR is supposed to be refreshed at VM-Enter, or am I
> > missing something here?
>
> Huh, I missed that.  It makes sense I guess; VM-Enter processes pending virtual
> interrupts, so it stands that VM-Enter would refresh PPR as well.
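
For reference, the two steps named in the quoted SDM text (PPR
virtualization, then evaluation of pending virtual interrupts) can be
sketched roughly as below.  This is only an illustration of the
architectural rule in SDM sections 30.1.3 and 30.2.1, not KVM code: the
helper names virtualize_ppr(), virtual_interrupt_pending() and
vppr_selftest() are hypothetical, the register values are made up, and the
"interrupt-window exiting" condition is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a KVM function): PPR virtualization.
 * VPPR takes VTPR if VTPR's priority class (bits 7:4) is at least SVI's
 * class, otherwise it takes SVI's class with the low nibble cleared.
 */
static uint8_t virtualize_ppr(uint8_t vtpr, uint8_t svi)
{
	if ((vtpr & 0xf0) >= (svi & 0xf0))
		return vtpr;
	return svi & 0xf0;
}

/*
 * Hypothetical helper: a pending virtual interrupt is recognized when
 * RVI's priority class is strictly above VPPR's class.  The
 * "interrupt-window exiting" control is assumed to be 0 here.
 */
static int virtual_interrupt_pending(uint8_t rvi, uint8_t vppr)
{
	return (rvi & 0xf0) > (vppr & 0xf0);
}

/* Small self-check with made-up register values. */
static void vppr_selftest(void)
{
	/* SVI class 5 outranks VTPR class 2: VPPR becomes 0x50. */
	assert(virtualize_ppr(0x20, 0x51) == 0x50);
	/* VTPR class 6 outranks SVI class 5: VPPR is VTPR itself. */
	assert(virtualize_ppr(0x60, 0x51) == 0x60);
	/* Equal classes: VTPR wins, low nibble preserved. */
	assert(virtualize_ppr(0x5f, 0x51) == 0x5f);

	/* Vector 0x61 (class 6) is deliverable over VPPR 0x50. */
	assert(virtual_interrupt_pending(0x61, 0x50));
	/* Vector 0x55 (class 5) is masked by VPPR 0x50. */
	assert(!virtual_interrupt_pending(0x55, 0x50));
}
```

The point of the sketch is that a stale VPPR (for example, one computed
before SVI was cleared) can keep a pending vector from being recognized
even though RVI is up to date.
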
>
> Ugh, and looking again, KVM refreshes PPR every time it checks for a pending
> interrupt, including the VM-Enter case (via kvm_apic_has_interrupt()) when nested
> posted interrupts are in use:
>
> 	/* Emulate processing of posted interrupts on VM-Enter. */
> 	if (nested_cpu_has_posted_intr(vmcs12) &&
> 	    kvm_apic_has_interrupt(vcpu) == vmx->nested.posted_intr_nv) {
> 		vmx->nested.pi_pending = true;
> 		kvm_make_request(KVM_REQ_EVENT, vcpu);
> 		kvm_apic_clear_irr(vcpu, vmx->nested.posted_intr_nv);
> 	}
>
> I'm still curious as to what's different about your setup, but certainly not
> curious enough to hold up a fix.

Actually, none of the above is even relevant.  PPR virtualization in the nested
VM-Enter case would be for _L2's_ vPPR, not L1's.  And that virtualization is
performed by hardware (vmcs02 has the correct RVI, SVI, and vAPIC information
for L2).

Which means my initial instinct that KVM is missing a PPR update somewhere is
likely correct.

That said, I'm inclined to go with the below fix anyways, because KVM updates PPR
_constantly_, e.g. every time kvm_vcpu_has_events() is invoked with IRQs enabled.
Which means that trying to avoid a PPR update on VM-Enter just to be pedantically
accurate is ridiculous.

> So, for an immediate fix, I _think_ we can do:
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index a8e7bc04d9bf..784b61c9810b 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3593,7 +3593,8 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  	 * effectively unblock various events, e.g. INIT/SIPI cause VM-Exit
>  	 * unconditionally.
>  	 */
> -	if (unlikely(evaluate_pending_interrupts))
> +	if (unlikely(evaluate_pending_interrupts) ||
> +	    kvm_apic_has_interrupt(vcpu))
>  		kvm_make_request(KVM_REQ_EVENT, vcpu);
>
>  	/*