On 09/13/2019 02:06 PM, Sean Christopherson wrote:
On Fri, Sep 13, 2019 at 01:37:55PM -0700, Krish Sadhukhan wrote:
On 9/4/19 8:42 AM, Sean Christopherson wrote:
On Thu, Aug 29, 2019 at 04:56:34PM -0400, Krish Sadhukhan wrote:
Bit# 31 in VM-exit reason is set by hardware in both cases of early VM-entry
failures and VM-entry failures due to invalid guest state.
This is incorrect: VMCS.EXIT_REASON is not written on a VM-Fail. If the
tests are passing, you're probably consuming a stale EXIT_REASON.
In vmx_vcpu_run(),
	if (vmx->fail || (vmx->exit_reason &
			  VMX_EXIT_REASONS_FAILED_VMENTRY))
		return;

	vmx->loaded_vmcs->launched = 1;
we return without setting "launched" whenever bit# 31 is set in the Exit
Reason. If VM-entry fails due to invalid guest state or due to errors in
the VM-entry MSR-loading area, bit# 31 is set. As a result, L2 is not in
the "launched" state when we return to L1. Tests that want to VMRESUME L2
after fixing the bad guest state or the bad MSR-loading area fail with
VM-Instruction Error 5:
"Early vmresume failure: error number is 5. See Intel 30.4."
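To make that sequence concrete, here is a toy C model of it. This is only a sketch: `struct toy_vmcs`, `toy_update_launched()` and `toy_vmresume()` are hypothetical names made up for illustration, not KVM or kvm-unit-tests code. The only parts taken from this thread are the shape of the check in vmx_vcpu_run() and error number 5, which is the VM-instruction error reported for VMRESUME with a non-launched VMCS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 31 of the exit reason: the VM-Exit reports a failed VM-entry. */
#define TOY_ENTRY_FAILURE_BIT		(1u << 31)
/* VM-instruction error 5: VMRESUME with non-launched VMCS. */
#define TOY_VMRESUME_NONLAUNCHED_VMCS	5

/* Hypothetical stand-in for the launch state tracked per VMCS. */
struct toy_vmcs {
	bool launched;
};

/* Same shape as the quoted check: only a clean entry marks the VMCS launched. */
static void toy_update_launched(struct toy_vmcs *vmcs, bool vm_fail,
				uint32_t exit_reason)
{
	assert(vmcs != NULL);
	if (vm_fail || (exit_reason & TOY_ENTRY_FAILURE_BIT))
		return;
	vmcs->launched = true;
}

/* Returns 0 on success, or the VM-instruction error number on failure. */
static int toy_vmresume(const struct toy_vmcs *vmcs)
{
	assert(vmcs != NULL);
	return vmcs->launched ? 0 : TOY_VMRESUME_NONLAUNCHED_VMCS;
}
```

In this model, an entry that ends with exit reason (bit 31 | 33), i.e. invalid guest state, leaves `launched` clear, so a following `toy_vmresume()` returns 5 even after the guest state has been fixed.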
Yes, a VMCS isn't marked launched if it generates a VM-Exit due to a
failed consistency check. But as that code shows, a failed consistency
check results in said VM-Exit *or* a VM-Fail. Consistency checks that
fail early, i.e. before loading guest state, generate VM-Fail; any check
that fails after the CPU has started loading guest state manifests as a
VM-Exit. VMCS.EXIT_REASON isn't touched in the VM-Fail case.
E.g. in CHECKS ON VMX CONTROLS AND HOST-STATE AREA, the SDM states:
  VM entry fails if any of these checks fail. When such failures occur,
  control is passed to the next instruction, RFLAGS.ZF is set to 1 to
  indicate the failure, and the VM-instruction error field is loaded with
  an error number that indicates whether the failure was due to the
  controls or the host-state area (see Chapter 30).
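A minimal sketch of how a caller could keep the two failure modes apart, assuming a hypothetical `struct entry_result` that records what software observes after VMLAUNCH/VMRESUME. None of these names exist in KVM or kvm-unit-tests. The point is the ordering: the VM-Fail flag is checked first, and the exit reason is only consulted when the instruction did not VM-Fail, because the field may hold a value left over from an earlier exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 31 of the exit reason: the VM-Exit reports a failed VM-entry. */
#define ENTRY_FAILURE_BIT	(1u << 31)

/* Hypothetical outcome classes, for illustration only. */
enum entry_outcome {
	ENTRY_OK,		/* entered the guest, exited normally */
	ENTRY_VM_FAIL,		/* early check failed, no VM-Exit occurred */
	ENTRY_FAILED_VM_EXIT,	/* late check failed, reported as a VM-Exit */
};

/* Hypothetical snapshot of state after VMLAUNCH/VMRESUME returns. */
struct entry_result {
	bool vm_fail;		/* RFLAGS.CF or RFLAGS.ZF set by the instruction */
	uint32_t exit_reason;	/* VMCS exit reason; stale if vm_fail is set */
};

static enum entry_outcome classify_entry(const struct entry_result *r)
{
	assert(r != NULL);

	/* On VM-Fail the exit reason was not written, so do not read it. */
	if (r->vm_fail)
		return ENTRY_VM_FAIL;

	if (r->exit_reason & ENTRY_FAILURE_BIT)
		return ENTRY_FAILED_VM_EXIT;

	return ENTRY_OK;
}
```

With this ordering, a result with `vm_fail` set and a leftover exit reason of (bit 31 | 33) is classified as a VM-Fail rather than as an invalid-guest-state VM-Exit.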
Marc Orr's patch,
"[kvm-unit-tests PATCH v2] x86: nvmx: test max atomic switch MSRs",
fixes this problem.