On 09/11/2017 19:27, Liran Alon wrote:
> Consider the following scenario:
> 1. CPU A calls vmx_deliver_nested_posted_interrupt() to send an IPI
> to CPU B via virtual posted-interrupt mechanism.
> 2. CPU B is currently executing L2 guest.
> 3. vmx_deliver_nested_posted_interrupt() calls
> kvm_vcpu_trigger_posted_interrupt() which will note that
> vcpu->mode == IN_GUEST_MODE.
> 4. Assume that before CPU A sends the physical POSTED_INTR_NESTED_VECTOR
> IPI, CPU B exits from L2 to L0 during event-delivery
> (valid IDT-vectoring-info).
> 5. CPU A now sends the physical IPI. The IPI is received in host and
> it's handler (smp_kvm_posted_intr_nested_ipi()) does nothing.
> 6. Assume that before CPU A sets pi_pending=true and KVM_REQ_EVENT,
> CPU B continues to run in L0 and reach vcpu_enter_guest(). As
> KVM_REQ_EVENT is not set yet, vcpu_enter_guest() will continue and resume
> L2 guest.
> 7. At this point, CPU A sets pi_pending=true and KVM_REQ_EVENT but
> it's too late! CPU B already entered L2 and KVM_REQ_EVENT will only be
> consumed at next L2 entry!

The bug is real (great debugging!) but I think the fix is wrong.  The
basic issue is that we're not kicking the VCPU, so this should also fix it:

	/* the PIR and ON have been set by L1. */
	if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true)) {
		/*
		 * If a posted intr is not recognized by hardware,
		 * we will accomplish it in the next vmentry.
		 */
		vmx->nested.pi_pending = true;
		kvm_make_request(KVM_REQ_EVENT, vcpu);
		kvm_vcpu_kick(vcpu);
	}

See the comments around the setting of IN_GUEST_MODE, introduced by
commit b95234c84004 ("kvm: x86: do not use KVM_REQ_EVENT for APICv
interrupt injection", 2017-02-15).  Even though nested PI must use
KVM_REQ_EVENT, the reasoning behind the ordering of cli and
vcpu->mode = IN_GUEST_MODE should hold in both cases.

> Another scenario to consider:
> 1. CPU A calls vmx_deliver_nested_posted_interrupt() to send an IPI
> to CPU B via virtual posted-interrupt mechanism.
> 2. Assume that before CPU A calls kvm_vcpu_trigger_posted_interrupt(),
> CPU B is at L0 and is about to resume into L2. Further assume that it is
> in vcpu_enter_guest() after check for KVM_REQ_EVENT.
> 3. At this point, CPU A calls kvm_vcpu_trigger_posted_interrupt() which
> will note that vcpu->mode != IN_GUEST_MODE. Therefore, do nothing and
> return false. Then, will set pi_pending=true and KVM_REQ_EVENT.
> 4. Now CPU B continue and resumes into L2 guest without processing
> the posted-interrupt until next L2 entry!

Adding a kick should fix this as well.

Paolo
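
For illustration only, one interleaving of the second scenario can be
replayed in a small user-space C model.  This is a sketch, not kernel
code: model_vcpu, model_trigger(), model_kick() and model_race() are
hypothetical stand-ins invented for the example.  It assumes that a kick
forces the target out of the guest only if the target is already in guest
mode when the kick is issued, and it fixes a single ordering of the steps
rather than exploring all of them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are not the kernel's definitions. */
enum model_mode { MODEL_OUTSIDE_GUEST, MODEL_IN_GUEST };

struct model_vcpu {
	enum model_mode mode;
	bool req_event;   /* stands in for KVM_REQ_EVENT */
	bool pi_pending;  /* stands in for vmx->nested.pi_pending */
	bool forced_exit; /* a kick reached the vCPU while it was in the guest */
};

/* Sender side: the notification is "sent" only if the target is in guest mode. */
static bool model_trigger(const struct model_vcpu *v)
{
	return v->mode == MODEL_IN_GUEST;
}

/* Assumed kick semantics: force an exit only if the target is in the guest. */
static void model_kick(struct model_vcpu *v)
{
	if (v->mode == MODEL_IN_GUEST)
		v->forced_exit = true;
}

/*
 * Replays one fixed interleaving and returns true if the pending
 * interrupt would be noticed promptly, false if it waits for an
 * unrelated exit.
 */
static bool model_race(bool with_kick)
{
	struct model_vcpu v = { MODEL_OUTSIDE_GUEST, false, false, false };

	/* CPU A samples the mode before CPU B has switched to guest mode. */
	bool sent = model_trigger(&v);

	/* CPU B sees no pending request and enters the guest. */
	if (!v.req_event)
		v.mode = MODEL_IN_GUEST;

	/* CPU A takes the fallback path because nothing was sent. */
	if (!sent) {
		v.pi_pending = true;
		v.req_event = true;
		if (with_kick)
			model_kick(&v);
	}

	/* In this model a kick is only issued after the request is recorded. */
	assert(!v.forced_exit || v.req_event);

	return v.mode != MODEL_IN_GUEST || v.forced_exit;
}
```

In this model, model_race(false) returns false (the request sits unnoticed
while the target runs in the guest) and model_race(true) returns true (the
kick forces an exit, so the request is seen on the next entry).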