Re: [PATCH 1/2] KVM: nVMX: fix CR4_READ_SHADOW when L0 updates CR4 during a signal

Hi Sean,

On Tue, 2024-04-16 at 07:35 -0700, Sean Christopherson wrote:
> On Tue, Apr 16, 2024, Julian Stecklina wrote:
> > From: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>
> > 
> > This issue occurs when the kernel is interrupted by a signal while
> > running a L2 guest. If the signal is meant to be delivered to the L0
> > VMM, and L0 updates CR4 for L1, i.e. when the VMM sets
> > KVM_SYNC_X86_SREGS in kvm_run->kvm_dirty_regs, the kernel programs an
> > incorrect read shadow value for L2's CR4.
> > 
> > The result is that the guest can read a value for CR4 where bits from
> > L1 have leaked into L2.
> 
> No, this is a userspace bug.  If L2 is active when userspace stuffs register state,
> then from KVM's perspective the incoming value is L2's value.  E.g. if userspace
> *wants* to update L2 CR4 for whatever reason, this patch would result in L2 getting
> a stale value, i.e. the value of CR4 at the time of VM-Enter.
> 
> And even if userspace wants to change L1, this patch is wrong, as KVM is writing
> vmcs02.GUEST_CR4, i.e. is clobbering the L2 CR4 that was programmed by L1, *and*
> is dropping the CR4 value that userspace wanted to stuff for L1.
> 
> To fix this, your userspace needs to either wait until L2 isn't active, or force
> the vCPU out of L2 (which isn't easy, but it's doable if absolutely necessary).

What you say makes sense. Is there any way for userspace to detect
whether L2 is currently active after returning from KVM_RUN? I couldn't
find anything in the official documentation:
https://docs.kernel.org/virt/kvm/api.html

Can you point me in the right direction?
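
The closest candidate that looks usable is KVM_GET_NESTED_STATE (gated by
KVM_CAP_NESTED_STATE): the uapi header defines KVM_STATE_NESTED_GUEST_MODE
as a bit in struct kvm_nested_state.flags. A rough, untested sketch,
assuming that flag really is the right signal for "L2 is active".
nested_flags_in_l2() and vcpu_in_l2() are made-up helper names, and
max_size is assumed to be the value returned by
KVM_CHECK_EXTENSION(KVM_CAP_NESTED_STATE) on the /dev/kvm fd:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: does the nested-state flags word say "in L2"? */
static bool nested_flags_in_l2(uint16_t flags)
{
	return (flags & KVM_STATE_NESTED_GUEST_MODE) != 0;
}

/*
 * Hypothetical helper: returns 1 if the vCPU is in guest mode (L2),
 * 0 if it is not, and -1 if the state could not be read.
 *
 * max_size must be large enough for the full nested state; KVM fails
 * the ioctl with E2BIG if the buffer is too small, so a header-only
 * buffer is not sufficient once nested virtualization is in use.
 */
static int vcpu_in_l2(int vcpu_fd, size_t max_size)
{
	struct kvm_nested_state *state;
	int ret;

	if (max_size < sizeof(*state))
		return -1;

	state = calloc(1, max_size);
	if (!state)
		return -1;

	/* Tell KVM how much room it has for the header plus payload. */
	state->size = (uint32_t)max_size;

	if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, state) < 0)
		ret = -1;
	else
		ret = nested_flags_in_l2(state->flags) ? 1 : 0;

	free(state);
	return ret;
}
```

This copies the whole nested state (several KiB) on each call, so it is
probably only acceptable on the slow path, i.e. right before deciding
whether to mark registers dirty.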

> 
> Pulling in a snippet from the initial bug report[*],
> 
>  : The reason why this triggers in VirtualBox and not in Qemu is that there are
>  : cases where VirtualBox marks CR4 dirty even though it hasn't changed.
> 
> simply not trying to stuff register state dirty when L2 is active sounds like it
> would resolve the issue.
> 
> [*] https://lore.kernel.org/all/af2ede328efee9dc3761333bd47648ee6f752686.camel@xxxxxxxxxxxxxxxxxxxxx
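
On the VMM side, avoiding the redundant "dirty" marking could look
roughly like the sketch below. It assumes the VMM uses the kvm_run sync
area (KVM_CAP_SYNC_REGS) and that run->s.regs.sregs still holds the
values KVM last synced out, i.e. KVM_SYNC_X86_SREGS was set in
kvm_valid_regs. sync_sregs_if_changed() is a made-up helper name. Note
that this only suppresses writes that change nothing; a genuine update
while L2 is active would still be applied as L2 state.

```c
#include <stdbool.h>
#include <string.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: copy new_sregs into the sync area and flag the
 * sregs dirty only if they differ from what is already there.
 * Returns true if KVM_SYNC_X86_SREGS was set in kvm_dirty_regs.
 *
 * Both structures are compared bytewise, so callers should zero them
 * before filling them in to keep the explicit padding fields equal.
 */
static bool sync_sregs_if_changed(struct kvm_run *run,
				  const struct kvm_sregs *new_sregs)
{
	struct kvm_sregs *cur = &run->s.regs.sregs;

	/* Nothing changed: leave kvm_dirty_regs alone. */
	if (memcmp(cur, new_sregs, sizeof(*cur)) == 0)
		return false;

	*cur = *new_sregs;
	run->kvm_dirty_regs |= KVM_SYNC_X86_SREGS;
	return true;
}
```
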
> 
> > We found this issue by running uXen [1] as L2 in VirtualBox/KVM [2].
> > The issue can also easily be reproduced in Qemu/KVM if we force a sreg
> > sync on each call to KVM_RUN [3]. The issue can also be reproduced by
> > running a L2 Windows 10. In the Windows case, CR4.VMXE leaks from L1
> > to L2 causing the OS to blue-screen with a kernel thread exception
> > during TLB invalidation where the following code sequence triggers the
> > issue:
> > 
> > mov rax, cr4 <--- L2 reads CR4 with contents from L1
> > mov rcx, cr4
> > btc 0x7, rax <--- L2 toggles CR4.PGE
> > mov cr4, rax <--- #GP because L2 writes CR4 with reserved bits set
> > mov cr4, rcx
> > 
> > The existing code seems to fixup CR4_READ_SHADOW after calling
> > vmx_set_cr4 except in __set_sregs_common. While we could fix it there
> > as well, it's easier to just handle it centrally.
> > 
> > There might be a similar issue with CR0.
> > 
> > [1] https://github.com/OpenXT/uxen
> > [2] https://github.com/cyberus-technology/virtualbox-kvm
> > [3] https://github.com/tpressure/qemu/commit/d64c9d5e76f3f3b747bea7653d677bd61e13aafe
> > 
> > Signed-off-by: Julian Stecklina <julian.stecklina@xxxxxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>
> 
> SoB is reversed, yours should come after Thomas'.
> 
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 6780313914f8..0d4af00245f3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -3474,7 +3474,11 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> >  			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
> >  	}
> >  
> > -	vmcs_writel(CR4_READ_SHADOW, cr4);
> > +	if (is_guest_mode(vcpu))
> > +		vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(get_vmcs12(vcpu)));
> > +	else
> > +		vmcs_writel(CR4_READ_SHADOW, cr4);
> > +
> >  	vmcs_writel(GUEST_CR4, hw_cr4);
> >  
> >  	if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
> > -- 
> > 2.43.2
> > 




