On Fri, Sep 20, 2024, Markku Ahvenjärvi wrote:
> Running certain hypervisors under KVM on VMX suffered L1 hangs after
> launching a nested guest. The external interrupts were not processed on
> vmlaunch/vmresume due to stale VPPR, and L2 guest would resume without
> allowing L1 hypervisor to process the events.
>
> The patch ensures VPPR to be updated when checking for pending
> interrupts.

This is architecturally incorrect, PPR isn't refreshed at VM-Enter.

Aha!  I wonder if the missing PPR update is due to the nested VM-Enter path
directly clearing IRR when processing a posted interrupt.

On top of https://github.com/kvm-x86/linux/tree/next, does this fix things?

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index a8e7bc04d9bf..a8255c6f0d51 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3731,7 +3731,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	    kvm_apic_has_interrupt(vcpu) == vmx->nested.posted_intr_nv) {
 		vmx->nested.pi_pending = true;
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
-		kvm_apic_clear_irr(vcpu, vmx->nested.posted_intr_nv);
+		kvm_apic_ack_interrupt(vcpu, vmx->nested.posted_intr_nv);
 	}
 
 	/* Hide L1D cache contents from the nested guest.  */
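
To illustrate the suspicion, here is a toy model of the difference between
"just clear the IRR bit" and "ack the interrupt".  This is NOT KVM code: the
struct and every toy_*() name are made up for illustration, and the PPR
calculation is a simplified take on the priority-class rule (PPR follows TPR
unless the highest in-service vector has a higher priority class).  The only
point is that the first helper never touches PPR, so whatever value was there
stays there, while the second recomputes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified local APIC state (not KVM's layout). */
struct toy_apic {
	bool irr[256];	/* pending (requested) vectors */
	bool isr[256];	/* in-service vectors */
	uint8_t tpr;	/* task priority */
	uint8_t ppr;	/* processor priority, cached until recomputed */
};

/* Return the highest set vector, or -1 if none is set. */
static int toy_highest(const bool *bits)
{
	for (int v = 255; v >= 0; v--) {
		if (bits[v])
			return v;
	}
	return -1;
}

/* Recompute PPR from TPR and the highest in-service priority class. */
static void toy_update_ppr(struct toy_apic *apic)
{
	int isrv = toy_highest(apic->isr);
	uint8_t isr_class = isrv < 0 ? 0 : (uint8_t)(isrv & 0xf0);

	if ((apic->tpr & 0xf0) >= isr_class)
		apic->ppr = apic->tpr;
	else
		apic->ppr = isr_class;
}

/* "Clear IRR" flavor: drops the pending bit, leaves ISR and PPR alone. */
static void toy_clear_irr(struct toy_apic *apic, uint8_t vec)
{
	apic->irr[vec] = false;
}

/* "Ack" flavor: move the vector from IRR to ISR, then refresh PPR. */
static void toy_ack(struct toy_apic *apic, uint8_t vec)
{
	assert(apic->irr[vec]);	/* only a pending vector can be acked */
	apic->irr[vec] = false;
	apic->isr[vec] = true;
	toy_update_ppr(apic);
}
```

In the toy model, calling toy_clear_irr() with a stale ppr leaves that stale
value in place, whereas toy_ack() on the same vector recalculates it.  Whether
that is what actually bites the nested VM-Enter path is exactly what the diff
above is meant to test.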