Patch "KVM: x86: Fix lapic timer interrupt lost after loading a snapshot." has been added to the 6.1-stable tree

This is a note to let you know that I've just added the patch titled

    KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     kvm-x86-fix-lapic-timer-interrupt-lost-after-loading.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 165753f3821befe8c41a5ef6e28ec5477224625b
Author: Haitao Shan <hshan@xxxxxxxxxx>
Date:   Tue Sep 12 16:55:45 2023 -0700

    KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
    
    [ Upstream commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 ]
    
    When running the Android emulator (which is based on QEMU 2.12) on
    certain Intel hosts with kernel version 6.3-rc1 or above, the guest
    freezes after loading a snapshot. This is almost 100%
    reproducible. By default, the Android emulator uses a snapshot
    to speed up the next launch of the same Android guest, so
    this breaks the Android emulator badly.
    
    I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
    running the command "loadvm" after "savevm". The same issue is
    observed. At the same time, none of our AMD platforms is impacted.
    Further experiments show that loading the KVM module with
    "enable_apicv=false" works around it.
    
    The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
    fire timer when it is migrated and expired, and in oneshot mode").
    However, as pointed out by Sean Christopherson, it was introduced
    by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
    KVM_SET_LAPIC"). Commit 8e6ed96cdd50 ("KVM: x86: fire timer when
    it is migrated and expired, and in oneshot mode") just makes it
    easier to hit the issue.
    
    With both commits applied, the oneshot lapic timer fires immediately
    inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
    platforms with APIC virtualization and posted interrupt processing,
    this eventually sets the corresponding PIR bit. However,
    all PIR bits are cleared later in the same KVM_SET_LAPIC call
    by apicv_post_state_restore, so the timer interrupt is lost.
    
    The fix is to move vmx_apicv_post_state_restore to the beginning of
    the KVM_SET_LAPIC call and rename it to vmx_apicv_pre_state_restore.
    What vmx_apicv_post_state_restore actually does is clear any
    former apicv state, and this is better done at the
    beginning.
    
    Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
    Cc: stable@xxxxxxxxxxxxxxx
    Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
    Signed-off-by: Haitao Shan <hshan@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@xxxxxxxxxx
    Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 2c6698aa218b1..abc07d0045897 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -106,6 +106,7 @@ KVM_X86_OP_OPTIONAL(vcpu_blocking)
 KVM_X86_OP_OPTIONAL(vcpu_unblocking)
 KVM_X86_OP_OPTIONAL(pi_update_irte)
 KVM_X86_OP_OPTIONAL(pi_start_assignment)
+KVM_X86_OP_OPTIONAL(apicv_pre_state_restore)
 KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
 KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
 KVM_X86_OP_OPTIONAL(set_hv_timer)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c1dcaa3d2d6eb..dfcdcafe3a2cd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1603,6 +1603,7 @@ struct kvm_x86_ops {
 	int (*pi_update_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
 	void (*pi_start_assignment)(struct kvm *kvm);
+	void (*apicv_pre_state_restore)(struct kvm_vcpu *vcpu);
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 	bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 4dba0a84ba2f3..edcf45e312b99 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2446,6 +2446,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 	u64 msr_val;
 	int i;
 
+	static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
 	if (!init_event) {
 		msr_val = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
 		if (kvm_vcpu_is_reset_bsp(vcpu))
@@ -2757,6 +2759,8 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	int r;
 
+	static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
 	kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
 	/* set SPIV separately to get count of SW disabled APICs right */
 	apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 31a10d774df6d..98d732b9418f1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6799,7 +6799,7 @@ static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 	vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
 }
 
-static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
+static void vmx_apicv_pre_state_restore(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -8172,7 +8172,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
 	.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
-	.apicv_post_state_restore = vmx_apicv_post_state_restore,
+	.apicv_pre_state_restore = vmx_apicv_pre_state_restore,
 	.check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
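
The ordering problem described in the commit message can be illustrated
with a small user-space model. This is only a sketch and not kernel
code: every toy_* name is hypothetical, the PIR is reduced to a plain
bitmap, and the "restore" step is reduced to posting the expired timer's
vector. It only shows why clearing the posted-interrupt state before the
restore keeps the timer interrupt, while clearing it afterwards drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_VECTORS 256

/* Toy stand-in for the posted-interrupt request bitmap: one bit per vector. */
struct toy_pi_desc {
	uint32_t pir[NR_VECTORS / 32];
};

/* Record a posted interrupt by setting the bit for its vector. */
static void toy_post_interrupt(struct toy_pi_desc *pi, unsigned int vector)
{
	assert(vector < NR_VECTORS);
	pi->pir[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* Report whether the bit for the given vector is set. */
static bool toy_interrupt_pending(const struct toy_pi_desc *pi,
				  unsigned int vector)
{
	assert(vector < NR_VECTORS);
	return (pi->pir[vector / 32] & (UINT32_C(1) << (vector % 32))) != 0;
}

/* Models the hook body (assumption): discard all previously posted bits. */
static void toy_clear_apicv_state(struct toy_pi_desc *pi)
{
	memset(pi, 0, sizeof(*pi));
}

/*
 * Old ordering: the restore fires the expired oneshot timer first, and
 * the clear runs afterwards, wiping the bit that was just set.
 */
static bool toy_set_lapic_old(struct toy_pi_desc *pi, unsigned int timer_vector)
{
	toy_post_interrupt(pi, timer_vector);
	toy_clear_apicv_state(pi);
	return toy_interrupt_pending(pi, timer_vector);
}

/*
 * New ordering: stale state is cleared first, so the timer interrupt
 * posted during the restore is still pending when the call returns.
 */
static bool toy_set_lapic_new(struct toy_pi_desc *pi, unsigned int timer_vector)
{
	toy_clear_apicv_state(pi);
	toy_post_interrupt(pi, timer_vector);
	return toy_interrupt_pending(pi, timer_vector);
}
```

In this model, toy_set_lapic_old() always returns false for the timer
vector and toy_set_lapic_new() always returns true, and in both cases
any bits that were set before the call are gone afterwards.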


