On Fri, Nov 26, 2021 at 4:01 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> Hi, David,
>
> On Fri, Nov 19, 2021 at 11:57:56PM +0000, David Matlack wrote:
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 2a7564703ea6..432a4df817ec 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1232,6 +1232,9 @@ struct kvm_arch {
> >  	hpa_t hv_root_tdp;
> >  	spinlock_t hv_root_tdp_lock;
> >  #endif
> > +
> > +	/* MMU caches used when splitting large pages during VM-ioctls. */
> > +	struct kvm_mmu_memory_caches split_caches;
>
> Are mmu_gfn_array_cache and mmu_pte_list_desc_cache wasted here? I saw that
> "struct kvm_mmu_memory_cache" still takes up quite a few hundreds of bytes,
> just want to make sure we won't waste them in vain.

Yes, they are wasted right now, but there are a couple of things to keep
in mind:

1. They are also wasted in every vCPU (in the per-vCPU caches) that does
   not use the shadow MMU.
2. They will (I think) be used eventually, when I add Eager Page Splitting
   support to the shadow MMU.
3. split_caches is per-VM, so it's only a few hundred bytes per VM.

If we really want to save the memory, the right way forward might be to
make each kvm_mmu_memory_cache a pointer instead of an embedded struct,
so that each one can be allocated dynamically only when needed (rough
sketch at the end of this mail). I can add that to my TODO list, but
given the points above I don't think it'd be worth blocking this series
on it.

>
> [...]
>
> > +int mmu_topup_split_caches(struct kvm *kvm)
> > +{
> > +	struct kvm_mmu_memory_caches *split_caches = &kvm->arch.split_caches;
> > +	int r;
> > +
> > +	assert_split_caches_invariants(kvm);
> > +
> > +	r = kvm_mmu_topup_memory_cache(&split_caches->page_header_cache, 1);
> > +	if (r)
> > +		goto out;
> > +
> > +	r = kvm_mmu_topup_memory_cache(&split_caches->shadow_page_cache, 1);
> > +	if (r)
> > +		goto out;
>
> Is it intended to only top-up with one cache object? IIUC this means we'll try
> to proactively yield the cpu for each of the huge page split right after the
> object is consumed.
>
> Wondering whether it be more efficient to make it a slightly larger number, so
> we don't overload the memory but also make the loop a bit more efficient.

IIUC, 1 here is just the min needed for kvm_mmu_topup_memory_cache() to
return success. I chose 1 for each cache because that is the minimum
necessary to make forward progress (split one large page). No matter what
you pass for min, kvm_mmu_topup_memory_cache() will still always try to
fill the cache up to KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE objects (see the
paraphrase of the topup logic at the end of this mail).

>
> > +
> > +	return 0;
> > +
> > +out:
> > +	pr_warn("Failed to top-up split caches. Will not split large pages.\n");
> > +	return r;
> > +}
>
> Thanks,
>
> --
> Peter Xu
>
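P.S. For concreteness, here is a rough sketch of the "pointer instead of
embedded struct" idea mentioned above. This is purely illustrative and not
part of this series; the helper name is made up, the layout of
struct kvm_mmu_memory_caches is approximated, and cache setup details
(gfp_zero, kmem_cache, error unwinding) are omitted:

struct kvm_mmu_memory_caches {
	struct kvm_mmu_memory_cache *page_header_cache;
	struct kvm_mmu_memory_cache *shadow_page_cache;
	/*
	 * The gfn_array / pte_list_desc caches would likewise become
	 * pointers, allocated only by paths that need them (e.g. the
	 * shadow MMU).
	 */
};

/* Allocate the cache struct itself on first use instead of embedding it. */
static int mmu_alloc_memory_cache(struct kvm_mmu_memory_cache **cache)
{
	if (*cache)
		return 0;

	*cache = kzalloc(sizeof(**cache), GFP_KERNEL_ACCOUNT);
	return *cache ? 0 : -ENOMEM;
}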
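P.P.S. To make the min semantics concrete: below is a paraphrase (from
memory, so it may not match virt/kvm/kvm_main.c exactly) of what
kvm_mmu_topup_memory_cache() does. min only decides whether the function
can return early and whether a partial fill still counts as success; the
fill loop itself always tries to go up to the capacity of the objects[]
array, i.e. KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE entries (40 on x86, IIRC):

int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min)
{
	void *obj;

	/* Already have at least min objects cached: nothing to do. */
	if (mc->nobjs >= min)
		return 0;

	/* Otherwise fill all the way up to capacity, not just up to min. */
	while (mc->nobjs < ARRAY_SIZE(mc->objects)) {
		obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT);
		if (!obj)
			return mc->nobjs >= min ? 0 : -ENOMEM;
		mc->objects[mc->nobjs++] = obj;
	}
	return 0;
}

So a larger min only raises the bar for when the call succeeds (and for
when it bothers to refill at all); it doesn't change how many objects a
top-up tries to allocate.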