Re: [PATCH 1/3] KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot

Tests of "normal VM + nested VM + 3 selftests" passed on the 3 configs:
1) modprobe kvm_intel ept=0
2) modprobe kvm tdp_mmu=0
   modprobe kvm_intel ept=1
3) modprobe kvm tdp_mmu=1
   modprobe kvm_intel ept=1

Reviewed-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Tested-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>

On Wed, Oct 09, 2024 at 12:23:43PM -0700, Sean Christopherson wrote:
> When performing a targeted zap on memslot removal, zap only MMU pages that
> shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and
> unnecessary.  Furthermore, for_each_gfn_valid_sp() arguably shouldn't
> exist, because it doesn't do what most people would expect it to do.
> The "round gfn for level" adjustment that is done for direct SPs (no gPTE)
> means that the exact gfn comparison will not get a match, even when a SP
> does "cover" a gfn, or was even created specifically for a gfn.
> 
> For memslot deletion specifically, KVM's behavior will vary significantly
> based on the size and alignment of a memslot, and in weird ways.  E.g. for
> a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if
> it's only 4KiB aligned.  And as described below, zapping SPs in the
> aligned case overzaps for direct MMUs, as odds are good the upper-level
> SPs are serving other memslots.
> 
> To iterate over all potentially-relevant gfns, KVM would need to make a
> pass over the hash table for each level, with the gfn used for lookup
> rounded for said level.  And then check that the SP is of the correct
> level, too, e.g. to avoid over-zapping.
> 
> But even then, KVM would massively overzap, as processing every level is
> all but guaranteed to zap SPs that serve other memslots, especially if the
> memslot being removed is relatively small.  KVM could mitigate that issue
> by processing only levels that can be possible guest huge pages, i.e. are
> less likely to be re-used for other memslots, but while somewhat logical,
> that's quite arbitrary and would be a bit of a mess to implement.
> 
> So, zap only SPs with gPTEs, as the resulting behavior is easy to describe,
> is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that
> absolutely must be zapped.
> 
> Cc: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/mmu.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index a9a23e058555..09494d01c38e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1884,14 +1884,10 @@ static bool sp_has_gptes(struct kvm_mmu_page *sp)
>  		if (is_obsolete_sp((_kvm), (_sp))) {			\
>  		} else
>  
> -#define for_each_gfn_valid_sp(_kvm, _sp, _gfn)				\
> +#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn)		\
>  	for_each_valid_sp(_kvm, _sp,					\
>  	  &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)])	\
> -		if ((_sp)->gfn != (_gfn)) {} else
> -
> -#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn)		\
> -	for_each_gfn_valid_sp(_kvm, _sp, _gfn)				\
> -		if (!sp_has_gptes(_sp)) {} else
> +		if ((_sp)->gfn != (_gfn) || !sp_has_gptes(_sp)) {} else
>  
>  static bool kvm_sync_page_check(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>  {
> @@ -7063,15 +7059,15 @@ static void kvm_mmu_zap_memslot_pages_and_flush(struct kvm *kvm,
>  
>  	/*
>  	 * Since accounting information is stored in struct kvm_arch_memory_slot,
> -	 * shadow pages deletion (e.g. unaccount_shadowed()) requires that all
> -	 * gfns with a shadow page have a corresponding memslot.  Do so before
> -	 * the memslot goes away.
> +	 * all MMU pages that are shadowing guest PTEs must be zapped before the
> +	 * memslot is deleted, as freeing such pages after the memslot is freed
> +	 * will result in use-after-free, e.g. in unaccount_shadowed().
>  	 */
>  	for (i = 0; i < slot->npages; i++) {
>  		struct kvm_mmu_page *sp;
>  		gfn_t gfn = slot->base_gfn + i;
>  
> -		for_each_gfn_valid_sp(kvm, sp, gfn)
> +		for_each_gfn_valid_sp_with_gptes(kvm, sp, gfn)
>  			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
>  
>  		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
> -- 
> 2.47.0.rc1.288.g06298d1525-goog
> 