On 6/7/22 23:36, Sean Christopherson wrote:
Extend the VMX MSRs quirk to the CR0/4_FIXED1 MSRs, i.e. when the quirk
is disabled, allow userspace to set the MSRs and do not rewrite the MSRs
during CPUID updates. The bits that the guest (L2 in this case) is
allowed to set are not directly tied to CPUID. Enumerating to L1 that it
can set reserved CR0/4 bits is nonsensical and will ultimately result in
a failed nested VM-Entry (KVM enforces guest reserved CR4 bits on top of
the VMX MSRs), but KVM typically doesn't police the vCPU model except
when doing so is necessary to protect the host kernel.
Further restricting CR4 bits is however a reasonable thing to do, e.g. to
work around a bug in nested virtualization, in which case exposing a
feature to L1 is ok, but letting L2 use the feature is not. Of course,
whether or not the L1 hypervisor will actually _check_ the FIXED1 bits is
another matter entirely, e.g. KVM currently assumes all bits that can be
set in the host can also be set in the guest.
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
I'm leaving this patch out for separate discussion since the quirk is
not needed anymore. I'll post the two reverts after some testing.
Paolo
---
Documentation/virt/kvm/api.rst | 8 ++++++++
arch/x86/kvm/vmx/nested.c | 33 ++++++++++++++++++++++++++++++---
arch/x86/kvm/vmx/vmx.c | 6 +++---
3 files changed, 41 insertions(+), 6 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 1095692ddab7..88d1bbae031e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7391,6 +7391,14 @@ The valid bits in cap.args[0] are:
IA32_VMX_TRUE_EXIT_CTLS[bit 44]
('load IA32_PERF_GLOBAL_CTRL'). Otherwise,
these corresponding MSR bits are cleared.
+ - MSR_IA32_VMX_CR0_FIXED1 is unconditionally
+ set to 0xffffffff
+ - CR4.PCE is unconditionally set in
+ MSR_IA32_VMX_CR4_FIXED1.
+ - All CR4 bits with an associated CPUID
+ feature flag are set in
+ MSR_IA32_VMX_CR4_FIXED1 if the feature is
+ reported as supported in guest CPUID.
When this quirk is disabled, KVM will not
change the values of the aformentioned VMX
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 5533c2474128..abce74cfefc9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1385,6 +1385,30 @@ static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
return 0;
}
+static u64 *vmx_get_fixed1_msr(struct nested_vmx_msrs *msrs, u32 msr_index)
+{
+ switch (msr_index) {
+ case MSR_IA32_VMX_CR0_FIXED1:
+ return &msrs->cr0_fixed1;
+ case MSR_IA32_VMX_CR4_FIXED1:
+ return &msrs->cr4_fixed1;
+ default:
+ BUG();
+ }
+}
+
+static int vmx_restore_fixed1_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+{
+ const u64 *msr = vmx_get_fixed1_msr(&vmcs_config.nested, msr_index);
+
+ /* Bits that "must-be-0" must not be set in the restored value. */
+ if (!is_bitwise_subset(*msr, data, -1ULL))
+ return -EINVAL;
+
+ *vmx_get_fixed1_msr(&vmx->nested.msrs, msr_index) = data;
+ return 0;
+}
+
/*
* Called when userspace is restoring VMX MSRs.
*
@@ -1432,10 +1456,13 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
case MSR_IA32_VMX_CR0_FIXED1:
case MSR_IA32_VMX_CR4_FIXED1:
/*
- * These MSRs are generated based on the vCPU's CPUID, so we
- * do not support restoring them directly.
+ * These MSRs are generated based on the vCPU's CPUID when KVM
+ * "owns" the VMX MSRs, do not allow restoring them directly.
*/
- return -EINVAL;
+ if (kvm_check_has_quirk(vmx->vcpu.kvm, KVM_X86_QUIRK_TWEAK_VMX_MSRS))
+ return -EINVAL;
+
+ return vmx_restore_fixed1_msr(vmx, msr_index, data);
case MSR_IA32_VMX_EPT_VPID_CAP:
return vmx_restore_vmx_ept_vpid_cap(vmx, data);
case MSR_IA32_VMX_VMCS_ENUM:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4c31c8f24329..139f365ca6bb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7520,10 +7520,10 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
- if (nested_vmx_allowed(vcpu)) {
+ if (nested_vmx_allowed(vcpu) &&
+ kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_MSRS)) {
nested_vmx_cr_fixed1_bits_update(vcpu);
- if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_MSRS))
- nested_vmx_entry_exit_ctls_update(vcpu);
+ nested_vmx_entry_exit_ctls_update(vcpu);
}
if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&