On Wed, Jul 06, 2022, Maxim Levitsky wrote: > On Tue, 2022-06-14 at 20:47 +0000, Sean Christopherson wrote: > > Clear mtf_pending on nested VM-Exit instead of handling the clear on a > > case-by-case basis in vmx_check_nested_events(). The pending MTF should > > rever survive nested VM-Exit, as it is a property of KVM's run of the > ^^ typo: never > > Also it is not clear what the 'case by case' means. > > I see that the vmx_check_nested_events always clears it unless nested run is pending > or we re-inject an event. Those two "unless ..." are the "cases". The point I'm trying to make in the changelog is that there's no need for any conditional logic whatsoever. > > @@ -3927,6 +3919,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > clear_bit(KVM_APIC_INIT, &apic->pending_events); > > if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) > > nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0); > > + > > + /* MTF is discarded if the vCPU is in WFS. */ > > + vmx->nested.mtf_pending = false; > > return 0; > > I guess MTF should also be discarded if we enter SMM, and I see that > VMX also enter SMM with a pseudo VM exit (in vmx_enter_smm) which > will clear the MTF. Good. No, a pending MTF should be preserved across SMI. It's not a regression because KVM incorrectly prioritizes MTF (and trap-like #DBs) over SMI (and because if KVM did prioritize SMI, the existing code would also drop the pending MTF). Note, this isn't the only flaw that needs to be addressed in order to correctly prioritize SMIs, e.g. KVM_{G,S}ET_NESTED_STATE would need to save/restore a pending MTF if the vCPU is in SMM after an SMI that arrived while L2 was active. Tangentially related, KVM's pseudo VM-Exit on SMI emulation is completely wrong[*]. [*] https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@xxxxxxxxxx