On Tue, Nov 12, 2024, Chao Gao wrote:
> On Fri, Nov 01, 2024 at 12:14:47PM -0700, Sean Christopherson wrote:
> >Move the handling of a nested posted interrupt notification that is
> >unblocked by nested VM-Enter (unblocks L1 IRQs when ack-on-exit is enabled
> >by L1) from VM-Enter emulation to vmx_check_nested_events(). To avoid a
> >pointless forced immediate exit, i.e. to not regress IRQ delivery latency
> >when a nested posted interrupt is pending at VM-Enter, block processing of
> >the notification IRQ if and only if KVM must block _all_ events. Unlike
> >injected events, KVM doesn't need to actually enter L2 before updating the
> >vIRR and vmcs02.GUEST_INTR_STATUS, as the resulting L2 IRQ will be blocked
> >by hardware itself, until VM-Enter to L2 completes.
> >
> >Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
> >entering L2 is still technically wrong. But, practically speaking, only a
> >userspace that is deliberately checking KVM_STATE_NESTED_RUN_PENDING
> >against PIR and IRR can even notice; L2 will see architecturally correct
> >behavior, as KVM ensure the VM-Enter is finished before doing anything
> >that would effectively preempt the PIR=>IRR movement.
>
> In my understanding, L1 can notice some priority issue in some cases. e.g.,
> L1 enables NMI window VM-exit and enters L2 with a nested posted interrupt
> notification. Assuming L2 doesn't block NMIs, then NMI window VM-exit should
> happen immediately after nested VM-enter even before the nested posted
> interrupt processing.
>
> Another case is the nested VM-enter may inject some events (i.e.,
> vmcs12->vm_entry_intr_info_field has a valid event). Event injection has
> higher priority over external interrupt VM-exit. The event injection may
> encounter EPT_VIOLATION which needs to be reflected to L1. In this case,
> L1 is supposed to observe the EPT VIOLATION before the nested posted interrupt
> processing.

Hmm, right, L1 could also observe the PIR=>IRR movement.
How about this?

  Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
  entering L2 is still technically wrong. But, practically speaking, only an
  L1 hypervisor or an L0 userspace that is deliberately checking event
  priority against PIR=>IRR processing can even notice; L2 will see
  architecturally correct behavior, as KVM ensures the VM-Enter is finished
  before doing anything that would effectively preempt the PIR=>IRR movement.