Patch "KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em" has been added to the 5.10-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     kvm-nvmx-do-not-clear-cr3-load-store-exiting-bits-if.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 9e34e4ee64d135591ddaa46f3ca2390cd0666c7f
Author: Sean Christopherson <seanjc@xxxxxxxxxx>
Date:   Tue Jul 13 09:33:02 2021 -0700

    KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
    
    [ Upstream commit 470750b3425513b9f63f176e564e63e0e7998afc ]
    
    Keep CR3 load/store exiting enable as needed when running L2 in order to
    honor L1's desires.  This fixes a largely theoretical bug where L1 could
    intercept CR3 but not CR0.PG and end up not getting the desired CR3 exits
    when L2 enables paging.  In other words, the existing !is_paging() check
    inadvertantly handles the normal case for L2 where vmx_set_cr0() is
    called during VM-Enter, which is guaranteed to run with paging enabled,
    and thus will never clear the bits.
    
    Removing the !is_paging() check will also allow future consolidation and
    cleanup of the related code.  From a performance perspective, this is
    all a nop, as the VMCS controls shadow will optimize away the VMWRITE
    when the controls are in the desired state.
    
    Add a comment explaining why CR3 is intercepted, with a big disclaimer
    about not querying the old CR3.  Because vmx_set_cr0() is used for flows
    that are not directly tied to MOV CR3, e.g. vCPU RESET/INIT and nested
    VM-Enter, it's possible that is_paging() is not synchronized with CR3
    load/store exiting.  This is actually guaranteed in the current code, as
    KVM starts with CR3 interception disabled.  Obviously that can be fixed,
    but there's no good reason to play whack-a-mole, and it tends to end
    poorly, e.g. descriptor table exiting for UMIP emulation attempted to be
    precise in the past and ended up botching the interception toggling.
    
    Fixes: fe3ef05c7572 ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12")
    Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
    Message-Id: <20210713163324.627647-25-seanjc@xxxxxxxxxx>
    Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
    Stable-dep-of: c4abd7352023 ("KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b9abe08c9d590..ca51df0df3f94 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3063,10 +3063,14 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu)
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
 }
 
+#define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \
+			  CPU_BASED_CR3_STORE_EXITING)
+
 void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long hw_cr0;
+	u32 tmp;
 
 	hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
 	if (is_unrestricted_guest(vcpu))
@@ -3093,18 +3097,42 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 #endif
 
 	if (enable_ept && !is_unrestricted_guest(vcpu)) {
+		/*
+		 * Ensure KVM has an up-to-date snapshot of the guest's CR3.  If
+		 * the below code _enables_ CR3 exiting, vmx_cache_reg() will
+		 * (correctly) stop reading vmcs.GUEST_CR3 because it thinks
+		 * KVM's CR3 is installed.
+		 */
 		if (!kvm_register_is_available(vcpu, VCPU_EXREG_CR3))
 			vmx_cache_reg(vcpu, VCPU_EXREG_CR3);
+
+		/*
+		 * When running with EPT but not unrestricted guest, KVM must
+		 * intercept CR3 accesses when paging is _disabled_.  This is
+		 * necessary because restricted guests can't actually run with
+		 * paging disabled, and so KVM stuffs its own CR3 in order to
+		 * run the guest when identity mapped page tables.
+		 *
+		 * Do _NOT_ check the old CR0.PG, e.g. to optimize away the
+		 * update, it may be stale with respect to CR3 interception,
+		 * e.g. after nested VM-Enter.
+		 *
+		 * Lastly, honor L1's desires, i.e. intercept CR3 loads and/or
+		 * stores to forward them to L1, even if KVM does not need to
+		 * intercept them to preserve its identity mapped page tables.
+		 */
 		if (!(cr0 & X86_CR0_PG)) {
-			/* From paging/starting to nonpaging */
-			exec_controls_setbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
-						  CPU_BASED_CR3_STORE_EXITING);
-			vcpu->arch.cr0 = cr0;
-			vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
-		} else if (!is_paging(vcpu)) {
-			/* From nonpaging to paging */
-			exec_controls_clearbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
-						    CPU_BASED_CR3_STORE_EXITING);
+			exec_controls_setbit(vmx, CR3_EXITING_BITS);
+		} else if (!is_guest_mode(vcpu)) {
+			exec_controls_clearbit(vmx, CR3_EXITING_BITS);
+		} else {
+			tmp = exec_controls_get(vmx);
+			tmp &= ~CR3_EXITING_BITS;
+			tmp |= get_vmcs12(vcpu)->cpu_based_vm_exec_control & CR3_EXITING_BITS;
+			exec_controls_set(vmx, tmp);
+		}
+
+		if (!is_paging(vcpu) != !(cr0 & X86_CR0_PG)) {
 			vcpu->arch.cr0 = cr0;
 			vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
 		}



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux