Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop

Chao Gao <chao.gao@xxxxxxxxx> · Mon, 25 Mar 2024 11:43:02 +0800

On Fri, Mar 08, 2024 at 05:09:29PM -0800, Sean Christopherson wrote:
>Unconditionally honor guest PAT on CPUs that support self-snoop, as
>Intel has confirmed that CPUs that support self-snoop always snoop caches
>and store buffers.  I.e. CPUs with self-snoop maintain cache coherency
>even in the presence of aliased memtypes, thus there is no need to trust
>the guest behaves and only honor PAT as a last resort, as KVM does today.
>
>Honoring guest PAT is desirable for use cases where the guest has access
>to non-coherent DMA _without_ bouncing through VFIO, e.g. when a virtual
>(mediated, for all intents and purposes) GPU is exposed to the guest, along
>with buffers that are consumed directly by the physical GPU, i.e. which
>can't be proxied by the host to ensure writes from the guest are performed
>with the correct memory type for the GPU.
>
>Cc: Yiwei Zhang <zzyiwei@xxxxxxxxxx>
>Suggested-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
>Suggested-by: Kevin Tian <kevin.tian@xxxxxxxxx>
>Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
>---
> arch/x86/kvm/mmu/mmu.c |  8 +++++---
> arch/x86/kvm/vmx/vmx.c | 10 ++++++----
> 2 files changed, 11 insertions(+), 7 deletions(-)
>
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index 403cd8f914cd..7fa514830628 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -4622,14 +4622,16 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
> bool kvm_mmu_may_ignore_guest_pat(void)
> {
> 	/*
>-	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
>+	 * When EPT is enabled (shadow_memtype_mask is non-zero), the CPU does
>+	 * not support self-snoop (or is affected by an erratum), and the VM
> 	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
> 	 * honor the memtype from the guest's PAT so that guest accesses to
> 	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
> 	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
>-	 * KVM _always_ ignores guest PAT (when EPT is enabled).
>+	 * KVM _always_ ignores or honors guest PAT, i.e. doesn't toggle SPTE
>+	 * bits in response to non-coherent device (un)registration.
> 	 */
>-	return shadow_memtype_mask;
>+	return !static_cpu_has(X86_FEATURE_SELFSNOOP) && shadow_memtype_mask;
> }
> 
> int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 17a8e4fdf9c4..5dc4c24ae203 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -7605,11 +7605,13 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> 
> 	/*
> 	 * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
>-	 * device attached.  Letting the guest control memory types on Intel
>-	 * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
>-	 * the guest to behave only as a last resort.
>+	 * device attached and the CPU doesn't support self-snoop.  Letting the
>+	 * guest control memory types on Intel CPUs without self-snoop may
>+	 * result in unexpected behavior, and so KVM's (historical) ABI is to
>+	 * trust the guest to behave only as a last resort.
> 	 */
>-	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
>+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
>+	    !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;

W/ this change, guests w/o pass-thru devices can also access UC memory. Locking
UC memory leads to bus lock. So, guests w/o pass-thru devices can potentially
launch DOS attacks on other CPUs on host. isn't it a problem?