On Wed, Mar 8, 2023 at 12:33 PM Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
>
> On Mon, 6 Mar 2023 14:41:12 -0800
> Vipin Sharma <vipinsh@xxxxxxxxxx> wrote:
>
> > +/**
> > + * Caller should hold the mutex lock corresponding to the cache, if available.
> > + */
> > +static int mmu_topup_sp_memory_cache(struct kvm_mmu_memory_cache *cache,
> > +                                     int min)
> > +{
> > +        int orig_nobjs, r;
> > +
> > +        orig_nobjs = cache->nobjs;
> > +        r = kvm_mmu_topup_memory_cache(cache, min);
> > +        if (orig_nobjs != cache->nobjs)
> > +                percpu_counter_add(&kvm_total_unused_cached_pages,
> > +                                   (cache->nobjs - orig_nobjs));
> > +
> > +        return r;
> > +}
> > +
>
> Maybe kvm_mmu_topup_shadow_page_cache() would be better?
>
> As a user of kvm_mmu_topup_memory_cache(), mmu_topup_sp_memory_cache() is
> not supposed to directly touch the kvm_mmu_memory_cache metadata.
>
> The name "mmu_topup_sp_memory_cache()" seems similar to
> "mmu_topup_memory_cache()". Renaming it would make its level
> self-documenting.
>

Sounds good. I will rename it.

> > @@ -4396,25 +4438,28 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
> >          if (r != RET_PF_INVALID)
> >                  return r;
> >
> > +        mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock);
>
> Can you elaborate more on why this lock is required? When will this lock
> contend?

This lock is not needed in this patch. It is used in patch 4, where the MMU
shrinker frees up the cache. Sean also mentioned this in an internal
discussion. I will move it to the patch where it is actually used.

> 1) Previously, mmu_topup_memory_caches() worked fine without a lock.
> 2) IMHO I was suspecting that this lock affects the parallelization
>    of TDP MMU fault handling.
>
> TDP MMU fault handling is intended to be optimized for parallel fault
> handling by taking a read lock and operating on the page table via atomic
> operations. Multiple fault handlers can enter the TDP MMU fault path
> because of read_lock(&vcpu->kvm->mmu_lock) below.
>
> W/ this lock, it seems part of the benefit of parallelization is gone,
> because the lock can contend earlier, above. Will this cause a performance
> regression?

This is a per-vCPU lock; with it, each vCPU can still handle faults in
parallel without contending on the lock.

> If the lock will not contend above, then I am not sure if we need it.
>

Not in this patch, but we will need it in patch 4 when clearing the cache
via the MMU shrinker. I will move it to the patch where it is actually
needed.
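
For reference, a sketch of the helper after the suggested rename, based
directly on the quoted patch; this is an illustration of the naming change,
not the final code:

/*
 * Sketch only. Promoting the name to the kvm_mmu_* level signals that,
 * unlike plain mmu_topup_*() users, this helper touches the
 * kvm_mmu_memory_cache metadata (nobjs) to keep the global unused-page
 * counter in sync.
 *
 * Caller should hold the mutex lock corresponding to the cache, if
 * available.
 */
static int kvm_mmu_topup_shadow_page_cache(struct kvm_mmu_memory_cache *cache,
                                           int min)
{
        int orig_nobjs = cache->nobjs;
        int r;

        r = kvm_mmu_topup_memory_cache(cache, min);
        if (orig_nobjs != cache->nobjs)
                percpu_counter_add(&kvm_total_unused_cached_pages,
                                   cache->nobjs - orig_nobjs);

        return r;
}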
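
To make the contention argument concrete: in the fault path each vCPU takes
only its own mutex, so vCPUs never serialize against each other; the only
contention is with the shrinker in patch 4, which must walk all vCPUs. A
minimal sketch follows; fault_path() and shrinker_path() are hypothetical
illustrations, not functions from the series (only the
mmu_shadow_page_cache_lock field comes from the quoted patch):

/*
 * Illustrative sketch only: function names are hypothetical and the
 * bodies are simplified.
 *
 * Fault path: each vCPU takes only its own cache lock, so two vCPUs
 * faulting in parallel never contend with each other here.
 */
static int fault_path(struct kvm_vcpu *vcpu)
{
        mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock);
        /* ... top up this vCPU's shadow page cache ... */
        mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock);
        return 0;
}

/*
 * Shrinker path (patch 4): walks every vCPU to free its cached pages, so
 * it must take each vCPU's lock in turn; this is the only place the lock
 * can contend with the fault path.
 */
static void shrinker_path(struct kvm *kvm)
{
        struct kvm_vcpu *vcpu;
        unsigned long i;

        kvm_for_each_vcpu(i, vcpu, kvm) {
                mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock);
                /* ... free this vCPU's cached shadow pages ... */
                mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock);
        }
}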