On Tue, 2022-06-14 at 20:47 +0000, Sean Christopherson wrote:
> Clear mtf_pending on nested VM-Exit instead of handling the clear on a
> case-by-case basis in vmx_check_nested_events(). The pending MTF should
> rever survive nested VM-Exit, as it is a property of KVM's run of the
  ^^ typo: never

Also, it is not clear what 'case-by-case' means here. I see that
vmx_check_nested_events() always clears it unless a nested run is pending
or we re-inject an event.

> current L2, i.e. should never affect the next L2 run by L1. In practice,
> this is likely a nop as getting to L1 with nested_run_pending is
> impossible, and KVM doesn't correctly handle morphing a pending exception
> that occurs on a prior injected exception (need for re-injected exception
> being the other case where MTF isn't cleared). However, KVM will
> hopefully soon correctly deal with a pending exception on top of an
> injected exception.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/nested.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d080bfca16ef..7b644513c82b 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3909,16 +3909,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
>  	unsigned long exit_qual;
>  	bool block_nested_events =
>  	    vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
> -	bool mtf_pending = vmx->nested.mtf_pending;
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>
> -	/*
> -	 * Clear the MTF state. If a higher priority VM-exit is delivered first,
> -	 * this state is discarded.
> -	 */
> -	if (!block_nested_events)
> -		vmx->nested.mtf_pending = false;
> -
>  	if (lapic_in_kernel(vcpu) &&
>  		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
>  		if (block_nested_events)
> @@ -3927,6 +3919,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
>  		clear_bit(KVM_APIC_INIT, &apic->pending_events);
>  		if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED)
>  			nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
> +
> +		/* MTF is discarded if the vCPU is in WFS. */
> +		vmx->nested.mtf_pending = false;
>  		return 0;

I guess MTF should also be discarded if we enter SMM, and I see that VMX
also enters SMM with a pseudo VM exit (in vmx_enter_smm), which will clear
the MTF. Good.

>  	}
>
> @@ -3964,7 +3959,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
>  		return 0;
>  	}
>
> -	if (mtf_pending) {
> +	if (vmx->nested.mtf_pending) {
>  		if (block_nested_events)
>  			return -EBUSY;
>  		nested_vmx_update_pending_dbg(vcpu);
> @@ -4562,6 +4557,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>
> +	/* Pending MTF traps are discarded on VM-Exit. */
> +	vmx->nested.mtf_pending = false;
> +
>  	/* trying to cancel vmlaunch/vmresume is a bug */
>  	WARN_ON_ONCE(vmx->nested.nested_run_pending);
>

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
	Maxim Levitsky