Fwd: [RFC] Keeping host value of CR4.MCE (and other bits) live during guest execution

There's a bug in kvm/vmx.c: even if the host has enabled machine checks
(CR4.MCE==1), the bit can be cleared while the CPU is running in guest
context.  If a machine check event arrives while the CPU is in guest
context and the effective CR4.MCE is zero, the machine raises CATERR and
crashes.

We should make sure the host value of CR4.MCE is always active.  The VMCS
has read and write shadows for CR4, so the guest can still think it wrote
its own value.
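
To make the shadowing idea concrete, here is a minimal sketch of how a guest
read of CR4 resolves when some bits are host-owned.  The helper name and its
parameters are hypothetical and do not come from vmx.c; the sketch assumes
the usual rule that host-owned bits are read from the read shadow and all
other bits from the real guest CR4.

```c
#define X86_CR4_MCE (1UL << 6)  /* CR4 bit 6: Machine Check Enable */

/*
 * Hypothetical model of a guest CR4 read (not a KVM function).
 *   hw_cr4          value actually in force while the guest runs
 *   host_owned_mask bits the guest is not allowed to control
 *   read_shadow     value the guest believes it wrote
 */
static unsigned long guest_visible_cr4(unsigned long hw_cr4,
                                       unsigned long host_owned_mask,
                                       unsigned long read_shadow)
{
        return (read_shadow & host_owned_mask) |
               (hw_cr4 & ~host_owned_mask);
}
```

With MCE host-owned, a guest that wrote MCE==0 keeps reading MCE==0 even
though the hardware value has MCE==1.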

For discussion: there's new complexity with CR4 shadowing (commit
1e02ce4cccdcb9688386e5b8d2c9fa4660b45389).  I measure CR4 reads at 24
cycles on Haswell and 36 on Sandy Bridge, which compares well with
L2 miss costs.  Is the shadowing worth the complexity?  CR4 is also
cached (with no real consistency mechanism) in the VMCS at the time
of guest VCPU creation.  If the CR4 value ever changes over time, or if
CR4 differs between CPUs in the system, all of this logic breaks.

Thanks,
Ben


---

The host's decision to enable machine check exceptions should remain
in force during non-root mode.  KVM wrote 0 to cr4 on VCPU reset and
passed that 0, modified only by the always-on bits, to vmcs.guest_cr4.
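
As a rough sketch of the difference, the two helpers below model the
hardware CR4 value before and after this change.  The function names are
hypothetical, and always_on is a stand-in parameter for whichever of
KVM_RMODE_VM_CR4_ALWAYS_ON / KVM_PMODE_VM_CR4_ALWAYS_ON applies; it is
assumed not to contain MCE.

```c
#define X86_CR4_MCE (1UL << 6)  /* CR4 bit 6: Machine Check Enable */

/* Before: the guest's value plus the always-on bits; host MCE is ignored. */
static unsigned long hw_cr4_old(unsigned long guest_cr4,
                                unsigned long always_on)
{
        return guest_cr4 | always_on;
}

/* After: MCE is taken from the host, every other bit from the guest. */
static unsigned long hw_cr4_new(unsigned long host_cr4,
                                unsigned long guest_cr4,
                                unsigned long always_on)
{
        return (host_cr4 & X86_CR4_MCE) |
               (guest_cr4 & ~X86_CR4_MCE) |
               always_on;
}
```

For a guest CR4 of 0 on a host with MCE set, hw_cr4_old() leaves MCE clear
while hw_cr4_new() keeps it set.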

Tested: Injected a machine check while a guest was spinning.
Before the change, if guest CR4.MCE==0, the machine check is
escalated to Catastrophic Error (CATERR) and the machine dies.
If guest CR4.MCE==1, the machine check causes a VMEXIT and is
handled normally by host Linux.  After the change, injecting a machine
check results in normal Linux machine check handling in both cases.


---
 arch/x86/kvm/vmx.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a214104..44c8d24 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3456,8 +3456,16 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)

 static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
-       unsigned long hw_cr4 = cr4 | (to_vmx(vcpu)->rmode.vm86_active ?
-                   KVM_RMODE_VM_CR4_ALWAYS_ON : KVM_PMODE_VM_CR4_ALWAYS_ON);
+       /*
+        * Pass through host's Machine Check Enable value to hw_cr4, which
+        * is in force while we are in guest mode.  Do not let guests control
+        * this bit, even if host CR4.MCE == 0.
+        */
+       unsigned long hw_cr4 =
+               (read_cr4() & X86_CR4_MCE) |
+               (cr4 & ~X86_CR4_MCE) |
+               (to_vmx(vcpu)->rmode.vm86_active ?
+                KVM_RMODE_VM_CR4_ALWAYS_ON : KVM_PMODE_VM_CR4_ALWAYS_ON);

        if (cr4 & X86_CR4_VMXE) {
                /*
--