This is a note to let you know that I've just added the patch titled KVM: nVMX: Unconditionally clear mtf_pending on nested VM-Exit to the 6.0-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: kvm-nvmx-unconditionally-clear-mtf_pending-on-nested.patch and it can be found in the queue-6.0 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 9960eda0640025a4b547fa5ca741bbb2ac8dc0c4 Author: Sean Christopherson <seanjc@xxxxxxxxxx> Date: Tue Aug 30 23:15:58 2022 +0000 KVM: nVMX: Unconditionally clear mtf_pending on nested VM-Exit [ Upstream commit 593a5c2e3c12a2f65967739267093255c47e9fe0 ] Clear mtf_pending on nested VM-Exit instead of handling the clear on a case-by-case basis in vmx_check_nested_events(). The pending MTF should never survive nested VM-Exit, as it is a property of KVM's run of the current L2, i.e. should never affect the next L2 run by L1. In practice, this is likely a nop as getting to L1 with nested_run_pending is impossible, and KVM doesn't correctly handle morphing a pending exception that occurs on a prior injected exception (need for re-injected exception being the other case where MTF isn't cleared). However, KVM will hopefully soon correctly deal with a pending exception on top of an injected exception. Add a TODO to document that KVM has an inversion priority bug between SMIs and MTF (and trap-like #DBS), and that KVM also doesn't properly save/restore MTF across SMI/RSM. Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx> Link: https://lore.kernel.org/r/20220830231614.3580124-12-seanjc@xxxxxxxxxx Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Stable-dep-of: 7709aba8f716 ("KVM: x86: Morph pending exceptions to pending VM-Exits at queue time") Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index c06c25fb9cbe..0aa40ea496a8 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3910,16 +3910,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) unsigned long exit_qual; bool block_nested_events = vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu); - bool mtf_pending = vmx->nested.mtf_pending; struct kvm_lapic *apic = vcpu->arch.apic; - /* - * Clear the MTF state. If a higher priority VM-exit is delivered first, - * this state is discarded. - */ - if (!block_nested_events) - vmx->nested.mtf_pending = false; - if (lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &apic->pending_events)) { if (block_nested_events) @@ -3928,6 +3920,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) clear_bit(KVM_APIC_INIT, &apic->pending_events); if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0); + + /* MTF is discarded if the vCPU is in WFS. */ + vmx->nested.mtf_pending = false; return 0; } @@ -3950,6 +3945,11 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * fault-like exceptions, TSS T flag #DB (not emulated by KVM, but * could theoretically come in from userspace), and ICEBP (INT1). * + * TODO: SMIs have higher priority than MTF and trap-like #DBs (except + * for TSS T flag #DBs). KVM also doesn't save/restore pending MTF + * across SMI/RSM as it should; that needs to be addressed in order to + * prioritize SMI over MTF and trap-like #DBs. + * * Note that only a pending nested run can block a pending exception. * Otherwise an injected NMI/interrupt should either be * lost or delivered to the nested hypervisor in the IDT_VECTORING_INFO, @@ -3965,7 +3965,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) return 0; } - if (mtf_pending) { + if (vmx->nested.mtf_pending) { if (block_nested_events) return -EBUSY; nested_vmx_update_pending_dbg(vcpu); @@ -4562,6 +4562,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, struct vcpu_vmx *vmx = to_vmx(vcpu); struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + /* Pending MTF traps are discarded on VM-Exit. */ + vmx->nested.mtf_pending = false; + /* trying to cancel vmlaunch/vmresume is a bug */ WARN_ON_ONCE(vmx->nested.nested_run_pending);