[Resending as plain text]

On Mon, Feb 5, 2018 at 10:21 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:

> This is incorrect. In the event of an early VM-entry failure (e.g. a
> VM-entry failure for "VM entry with invalid control field(s)"), no host
> state should be loaded from the VMCS12. Of course, no guest state should
> have been loaded from the VMCS12 either, but that's a problem we have with
> deferring some VMCS12 control field checks to the hardware.
>
> CR4 should be unchanged from the time of the VMLAUNCH/VMRESUME. There is
> no guarantee that vmcs12->host_cr4 holds the correct value.
>
> On Mon, Feb 5, 2018 at 3:05 AM Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>
>> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>>
>> In L0, Haswell client host:
>>
>> nested_vmx_exit_reflected failed vm entry 7
>> WARNING: CPU: 6 PID: 6797 at kvm/arch/x86/kvm//vmx.c:6206 handle_desc+0x2d/0x40 [kvm_intel]
>> CPU: 6 PID: 6797 Comm: qemu-system-x86 Tainted: G W OE 4.15.0+ #4
>> RIP: 0010:handle_desc+0x2d/0x40 [kvm_intel]
>> Call Trace:
>>  vmx_handle_exit+0xbd/0xe20 [kvm_intel]
>>  ? kvm_arch_vcpu_ioctl_run+0xcde/0x1c00 [kvm]
>>  kvm_arch_vcpu_ioctl_run+0xd5a/0x1c00 [kvm]
>>  kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>>  ? kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>>  ? __fget+0xfc/0x210
>>  ? __fget+0xfc/0x210
>>  do_vfs_ioctl+0xa4/0x6a0
>>  ? __fget+0x11d/0x210
>>  SyS_ioctl+0x79/0x90
>>  entry_SYSCALL_64_fastpath+0x25/0x9c
>>
>> This can be reproduced by running kvm-unit-tests/run_tests.sh vmx_controls
>> in L1. UMIP CPUID bit is exposed to the L1 UMIP aware guest since it is
>> emulated by enabling descriptor-table exits on L0. There is a vmentry fail
>> when L0 tries to run L2 directly, the L1 guest architectural CR4 is not
>> restored after this failure since commit 4f350c6dbcb (kvm: nVMX: Handle
>> deferred early VMLAUNCH/VMRESUME failure properly). The L2 is
>> kvm-unit-tests which will not write CR4 w/ X86_CR4_UMIP bit. After another
>> L1 access descriptor vmexit, we check L2's architectural CR4 instead of
>> L1's architectural CR4. This patch fixes it by restoring L1's
>> architectural CR4 after L0's VMLAUNCH/VMRESUME failure.
>>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>> ---
>>  arch/x86/kvm/vmx.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 23789c9..9fc0492 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11633,6 +11633,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>>  	 */
>>  	nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
>>
>> +	vcpu->arch.cr4 = vmcs12->host_cr4;
>>  	load_vmcs12_mmu_host_state(vcpu, vmcs12);
>>
>>  	/*
>> --
>> 2.7.4
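
For illustration only, the objection at the top of this message (CR4 should be unchanged from the time of the VMLAUNCH/VMRESUME, and vmcs12->host_cr4 is not guaranteed to match it) can be modelled as a small standalone C sketch. This is not KVM code and not a proposed patch: every toy_* name is hypothetical, and the model assumes the only relevant state is a single CR4 value that gets overwritten with L2's value before the hardware rejects the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CR4_UMIP (1ULL << 11)	/* CR4.UMIP is bit 11 */

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct toy_vmcs12 {
	uint64_t host_cr4;	/* not validated on an early entry failure */
	uint64_t guest_cr4;	/* value L1 wants L2 to run with */
};

struct toy_vcpu {
	uint64_t cr4;		/* architectural CR4 of whoever is running */
};

/*
 * Model of a nested VM-entry whose control-field checks are deferred to
 * the hardware.  Returns true if L2 was entered.  On an early failure,
 * CR4 is put back to what it was at VMLAUNCH/VMRESUME time rather than
 * being loaded from vmcs12->host_cr4.
 */
static bool toy_nested_entry(struct toy_vcpu *vcpu,
			     const struct toy_vmcs12 *vmcs12,
			     bool hw_rejects_controls)
{
	const uint64_t cr4_at_vmlaunch = vcpu->cr4;	/* snapshot L1's CR4 */

	vcpu->cr4 = vmcs12->guest_cr4;	/* guest state loaded too early */

	if (hw_rejects_controls) {
		vcpu->cr4 = cr4_at_vmlaunch;	/* undo, ignoring host_cr4 */
		return false;
	}
	return true;
}

/* Usage: L1 runs with UMIP set; vmcs12 carries no UMIP in either field. */
void toy_demo(void)
{
	struct toy_vmcs12 vmcs12 = { .host_cr4 = 0, .guest_cr4 = 0 };
	struct toy_vcpu vcpu = { .cr4 = TOY_CR4_UMIP };

	bool entered = toy_nested_entry(&vcpu, &vmcs12, true);

	assert(!entered);
	assert(vcpu.cr4 == TOY_CR4_UMIP);	/* L1's CR4 survives the failure */
}
```

In this model, assigning vmcs12->host_cr4 on the failure path would leave the vCPU with CR4 == 0, which is the kind of mismatch the reply warns about.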