Re: [PATCH] KVM: nVMX: Honor userspace MSR filter lists for nested VM-Enter/VM-Exit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 23, 2024 at 07:59:22AM +0800, Sean Christopherson wrote:
> Synthesize a consistency check VM-Exit (VM-Enter) or VM-Abort (VM-Exit) if
> L1 attempts to load/store an MSR via the VMCS MSR lists that userspace has
> disallowed access to via an MSR filter.  Intel already disallows including
> a handful of "special" MSRs in the VMCS lists, so denying access isn't
> completely without precedent.
> 
> More importantly, the behavior is well-defined _and_ can be communicated
> the end user, e.g. to the customer that owns a VM running as L1 on top of
> KVM.  On the other hand, ignoring userspace MSR filters is all but
> guaranteed to result in unexpected behavior as the access will hit KVM's
> internal state, which is likely not up-to-date.
> 
> Unlike KVM-internal accesses, instruction emulation, and dedicated VMCS
> fields, the MSRs in the VMCS load/store lists are 100% guest controlled,
> thus making it all but impossible to reason about the correctness of
> ignoring the MSR filter.  And if userspace *really* wants to deny access
> to MSRs via the aforementioned scenarios, userspace can hide the
> associated feature from the guest, e.g. by disabling the PMU to prevent
> accessing PERF_GLOBAL_CTRL via its VMCS field.  But for the MSR lists, KVM
> is blindly processing MSRs; the  MSR filters are the _only_ way for
> userspace to deny access.
> 
> This partially reverts commit ac8d6cad3c7b ("KVM: x86: Only do MSR
> filtering when access MSR by rdmsr/wrmsr").
> 
> Cc: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> 
> I found this by inspection when backporting Hou's change to an internal kernel.
> I don't love piggybacking Intel's "you can't touch these special MSRs" behavior,
> but ignoring the userspace MSR filters is far worse IMO.  E.g. if userspace is
> denying access to an MSR in order to reduce KVM's attack surface, letting L1
> sneak in reads/writes through VM-Enter/VM-Exit completely circumvents the
> filters.
> 
>  Documentation/virt/kvm/api.rst  | 19 ++++++++++++++++---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/vmx/nested.c       | 12 ++++++------
>  arch/x86/kvm/x86.c              |  6 ++++--
>  4 files changed, 28 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 8e5dad80b337..e6b1e42186f3 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -4226,9 +4226,22 @@ filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
>  an error.
>  
>  .. warning::
> -   MSR accesses as part of nested VM-Enter/VM-Exit are not filtered.
> -   This includes both writes to individual VMCS fields and reads/writes
> -   through the MSR lists pointed to by the VMCS.
> +   MSR accesses that are side effects of instruction execution (emulated or
> +   native) are not filtered as hardware does not honor MSR bitmaps outside of
> +   RDMSR and WRMSR, and KVM mimics that behavior when emulating instructions
> +   to avoid pointless divergence from hardware.  E.g. RDPID reads MSR_TSC_AUX,
> +   SYSENTER reads the SYSENTER MSRs, etc.
> +
> +   MSRs that are loaded/stored via dedicated VMCS fields are not filtered as
> +   part of VM-Enter/VM-Exit emulation.
> +
> +   MSRs that are loaded/store via VMX's load/store lists _are_ filtered as part
> +   of VM-Enter/VM-Exit emulation.  If an MSR access is denied on VM-Enter, KVM
> +   synthesizes a consistency check VM-Exit(EXIT_REASON_MSR_LOAD_FAIL).  If an
> +   MSR access is denied on VM-Exit, KVM synthesizes a VM-Abort.  In short, KVM
> +   extends Intel's architectural list of MSRs that cannot be loaded/saved via
> +   the VM-Enter/VM-Exit MSR list.  It is platform owner's responsibility to
> +   to communicate any such restrictions to their end users.
>
Do we also need to modify the statement before this warning? Since
the behaviour is different from RDMSR/WRMSR emulation case.

```
if an MSR access is denied by userspace the resulting KVM behavior depends on
whether or not KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER is
enabled.  If KVM_MSR_EXIT_REASON_FILTER is enabled, KVM will exit to userspace
on denied accesses, i.e. userspace effectively intercepts the MSR access.
```

>     x2APIC MSR accesses cannot be filtered (KVM silently ignores filters that
>     cover any x2APIC MSRs).
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 950a03e0181e..94d0bedc42ee 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2059,6 +2059,8 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu);
>  
>  void kvm_enable_efer_bits(u64);
>  bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer);
> +int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data);
> +int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data);
>  int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated);
>  int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data);
>  int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data);
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 2392a7ef254d..674f7089cc44 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -981,7 +981,7 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
>  				__func__, i, e.index, e.reserved);
>  			goto fail;
>  		}
> -		if (kvm_set_msr(vcpu, e.index, e.value)) {
> +		if (kvm_set_msr_with_filter(vcpu, e.index, e.value)) {
>  			pr_debug_ratelimited(
>  				"%s cannot write MSR (%u, 0x%x, 0x%llx)\n",
>  				__func__, i, e.index, e.value);
> @@ -1017,7 +1017,7 @@ static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
>  		}
>  	}
>  
> -	if (kvm_get_msr(vcpu, msr_index, data)) {
> +	if (kvm_get_msr_with_filter(vcpu, msr_index, data)) {
>  		pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__,
>  			msr_index);
>  		return false;
> @@ -1112,9 +1112,9 @@ static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
>  			/*
>  			 * Emulated VMEntry does not fail here.  Instead a less
>  			 * accurate value will be returned by
> -			 * nested_vmx_get_vmexit_msr_value() using kvm_get_msr()
> -			 * instead of reading the value from the vmcs02 VMExit
> -			 * MSR-store area.
> +			 * nested_vmx_get_vmexit_msr_value() by reading KVM's
> +			 * internal MSR state instead of reading the value from
> +			 * the vmcs02 VMExit MSR-store area.
>  			 */
>  			pr_warn_ratelimited(
>  				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
> @@ -4806,7 +4806,7 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
>  				goto vmabort;
>  			}
>  
> -			if (kvm_set_msr(vcpu, h.index, h.value)) {
> +			if (kvm_set_msr_with_filter(vcpu, h.index, h.value)) {
>  				pr_debug_ratelimited(
>  					"%s WRMSR failed (%u, 0x%x, 0x%llx)\n",
>  					__func__, j, h.index, h.value);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index af6c8cf6a37a..7b3659a05c27 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1942,19 +1942,21 @@ static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu,
>  	return ret;
>  }
>  
> -static int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data)
> +int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data)
>  {
>  	if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ))
>  		return KVM_MSR_RET_FILTERED;
>  	return kvm_get_msr_ignored_check(vcpu, index, data, false);
>  }
> +EXPORT_SYMBOL_GPL(kvm_get_msr_with_filter);
>  
> -static int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data)
> +int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data)
>  {
>  	if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE))
>  		return KVM_MSR_RET_FILTERED;
>  	return kvm_set_msr_ignored_check(vcpu, index, data, false);
>  }
> +EXPORT_SYMBOL_GPL(kvm_set_msr_with_filter);
>  
>  int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data)
>  {
> 
> base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f
> -- 
> 2.45.2.1089.g2a221341d9-goog




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux