From: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>

This issue occurs when the kernel is interrupted by a signal while
running an L2 guest. If the signal is meant to be delivered to the L0
VMM, and L0 updates CR4 for L1, i.e. when the VMM sets
KVM_SYNC_X86_SREGS in kvm_run->kvm_dirty_regs, the kernel programs an
incorrect read shadow value for L2's CR4.

The result is that the guest can read a value for CR4 where bits from
L1 have leaked into L2.

We found this issue by running uXen [1] as L2 in VirtualBox/KVM [2].
The issue can also easily be reproduced in Qemu/KVM if we force a sreg
sync on each call to KVM_RUN [3]. The issue can also be reproduced by
running an L2 Windows 10. In the Windows case, CR4.VMXE leaks from L1
to L2, causing the OS to blue-screen with a kernel thread exception
during TLB invalidation, where the following code sequence triggers the
issue:

mov rax, cr4 <--- L2 reads CR4 with contents from L1
mov rcx, cr4
btc 0x7, rax <--- L2 toggles CR4.PGE
mov cr4, rax <--- #GP because L2 writes CR4 with reserved bits set
mov cr4, rcx

The existing code seems to fix up CR4_READ_SHADOW after calling
vmx_set_cr4, except in __set_sregs_common. While we could fix it there
as well, it's easier to just handle it centrally.

There might be a similar issue with CR0.
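For readers less familiar with the VMX read shadow, the leak can be
modelled in a few lines of standalone C. This is only an illustrative
sketch, not kernel code: guest_visible_cr4() and l2_sees_vmxe() are
hypothetical helpers, and the mask and register values are assumptions
chosen to mirror the Windows case above. It relies on the architectural
rule that, for bits set in the CR4 guest/host mask, a guest read of CR4
returns the read shadow bit instead of the hardware bit.

```c
#include <assert.h>
#include <stdint.h>

#define X86_CR4_PGE  (1ULL << 7)   /* CR4 bit 7: page global enable */
#define X86_CR4_VMXE (1ULL << 13)  /* CR4 bit 13: VMX enable */

/*
 * Hypothetical helper: the value a guest observes on "mov reg, cr4".
 * Bits set in the guest/host mask come from the read shadow, all
 * other bits come from the hardware guest CR4 value.
 */
uint64_t guest_visible_cr4(uint64_t guest_cr4, uint64_t guest_host_mask,
			   uint64_t read_shadow)
{
	return (guest_cr4 & ~guest_host_mask) |
	       (read_shadow & guest_host_mask);
}

/*
 * Hypothetical scenario: the hardware CR4 used while L2 runs has VMXE
 * set and the host owns the VMXE bit (assumed mask). Returns 1 if L2
 * would see VMXE for the given read shadow value, 0 otherwise.
 */
int l2_sees_vmxe(uint64_t read_shadow)
{
	const uint64_t hw_cr4 = X86_CR4_VMXE | X86_CR4_PGE;
	const uint64_t mask = X86_CR4_VMXE;

	return (guest_visible_cr4(hw_cr4, mask, read_shadow) &
		X86_CR4_VMXE) != 0;
}
```

In this model, passing L1's CR4 (with VMXE set) as the read shadow makes
l2_sees_vmxe() return 1, which corresponds to the leak described above.
Passing the value L1 intends L2 to see (without VMXE) makes it return 0.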
[1] https://github.com/OpenXT/uxen
[2] https://github.com/cyberus-technology/virtualbox-kvm
[3] https://github.com/tpressure/qemu/commit/d64c9d5e76f3f3b747bea7653d677bd61e13aafe

Signed-off-by: Julian Stecklina <julian.stecklina@xxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>
---
 arch/x86/kvm/vmx/vmx.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6780313914f8..0d4af00245f3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3474,7 +3474,11 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
 	}
 
-	vmcs_writel(CR4_READ_SHADOW, cr4);
+	if (is_guest_mode(vcpu))
+		vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(get_vmcs12(vcpu)));
+	else
+		vmcs_writel(CR4_READ_SHADOW, cr4);
+
 	vmcs_writel(GUEST_CR4, hw_cr4);
 
 	if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
-- 
2.43.2