On Mon, Oct 09, 2023, Sean Christopherson wrote: > On Sat, Oct 07, 2023, Like Xu wrote: > > On 14/7/2023 2:50 pm, Yan Zhao wrote: > > > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h > > > index 92d5a1924fc1..38bd449226f6 100644 > > > --- a/arch/x86/kvm/mmu.h > > > +++ b/arch/x86/kvm/mmu.h > > > @@ -235,6 +235,13 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > > > return -(u32)fault & errcode; > > > } > > > +bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma); > > > + > > > +static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm) > > > +{ > > > + return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm)); > > > +} > > > + > > > void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end); > > > int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > > > index 1e5db621241f..b4f89f015c37 100644 > > > --- a/arch/x86/kvm/mmu/mmu.c > > > +++ b/arch/x86/kvm/mmu/mmu.c > > > @@ -4516,6 +4516,21 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu, > > > } > > > #endif > > > +bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma) > > > > According to the motivation provided in the comment, the function will no > > longer need to be passed the parameter "struct kvm *kvm" but will rely on > > the global parameters (plus vm_has_noncoherent_dma), removing "*kvm" ? > > Yeah, I'll fixup the commit to drop @kvm from the inner helper. Thanks! Gah, and I gave more bad advice when I suggested this idea. There's no need to explicitly check tdp_enabled, as shadow_memtype_mask is set to zero if TDP is disabled. And that must be the case, e.g. make_spte() would generate a corrupt shadow_memtype_mask were non-zero on Intel with shadow paging. Yan, can you take a look at what I ended up with (see below) to make sure it looks sane/acceptable to you? New hashes (assuming I didn't botch things and need even more fixup). [1/5] KVM: x86/mmu: Add helpers to return if KVM honors guest MTRRs https://github.com/kvm-x86/linux/commit/ec1d8217d59b [2/5] KVM: x86/mmu: Zap SPTEs when CR0.CD is toggled iff guest MTRRs are honored https://github.com/kvm-x86/linux/commit/40de16c10b9d [3/5] KVM: x86/mmu: Zap SPTEs on MTRR update iff guest MTRRs are honored https://github.com/kvm-x86/linux/commit/defc3fae8d0f [4/5] KVM: x86/mmu: Zap KVM TDP when noncoherent DMA assignment starts/stops https://github.com/kvm-x86/linux/commit/b344d331adeb [5/5] KVM: VMX: drop IPAT in memtype when CD=1 for KVM_X86_QUIRK_CD_NW_CLEARED https://github.com/kvm-x86/linux/commit/a4d14445c47d commit ec1d8217d59bd7cb03ae4e80551fee987be98a4e Author: Yan Zhao <yan.y.zhao@xxxxxxxxx> Date: Fri Jul 14 14:50:06 2023 +0800 KVM: x86/mmu: Add helpers to return if KVM honors guest MTRRs Add helpers to check if KVM honors guest MTRRs instead of open coding the logic in kvm_tdp_page_fault(). Future fixes and cleanups will also need to determine if KVM should honor guest MTRRs, e.g. for CR0.CD toggling and and non-coherent DMA transitions. Provide an inner helper, __kvm_mmu_honors_guest_mtrrs(), so that KVM can if guest MTRRs were honored when stopping non-coherent DMA. Note, there is no need to explicitly check that TDP is enabled, KVM clears shadow_memtype_mask when TDP is disabled, i.e. it's non-zero if and only if EPT is enabled. Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx> Link: https://lore.kernel.org/r/20230714065006.20201-1-yan.y.zhao@xxxxxxxxx Link: https://lore.kernel.org/r/20230714065043.20258-1-yan.y.zhao@xxxxxxxxx [sean: squash into a one patch, drop explicit TDP check massage changelog] Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 253fb2093d5d..bb8c86eefac0 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -237,6 +237,13 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, return -(u32)fault & errcode; } +bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma); + +static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm) +{ + return __kvm_mmu_honors_guest_mtrrs(kvm_arch_has_noncoherent_dma(kvm)); +} + void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end); int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index f7901cb4d2fa..5d3dc7119e57 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4479,21 +4479,28 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu, } #endif +bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma) +{ + /* + * If host MTRRs are ignored (shadow_memtype_mask is non-zero), and the + * VM has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is + * to honor the memtype from the guest's MTRRs so that guest accesses + * to memory that is DMA'd aren't cached against the guest's wishes. + * + * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs, + * e.g. KVM will force UC memtype for host MMIO. + */ + return vm_has_noncoherent_dma && shadow_memtype_mask; +} + int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) { /* * If the guest's MTRRs may be used to compute the "real" memtype, * restrict the mapping level to ensure KVM uses a consistent memtype - * across the entire mapping. If the host MTRRs are ignored by TDP - * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA - * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype - * from the guest's MTRRs so that guest accesses to memory that is - * DMA'd aren't cached against the guest's wishes. - * - * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs, - * e.g. KVM will force UC memtype for host MMIO. + * across the entire mapping. */ - if (shadow_memtype_mask && kvm_arch_has_noncoherent_dma(vcpu->kvm)) { + if (kvm_mmu_honors_guest_mtrrs(vcpu->kvm)) { for ( ; fault->max_level > PG_LEVEL_4K; --fault->max_level) { int page_num = KVM_PAGES_PER_HPAGE(fault->max_level); gfn_t base = gfn_round_for_level(fault->gfn,