On Wed, Dec 1, 2021 at 11:22 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Nov 19, 2021, David Matlack wrote:
> > When using initially-all-set, large pages are not write-protected when
> > dirty logging is enabled on the memslot. Instead they are
> > write-protected once userspace invokes CLEAR_DIRTY_LOG for the first
> > time, and only for the specific sub-region of the memslot that
> > userspace wishes to clear.
> >
> > Enhance CLEAR_DIRTY_LOG to also try to split large pages prior to
> > write-protecting to avoid causing write-protection faults on vCPU
> > threads. This also allows userspace to smear the cost of large page
> > splitting across multiple ioctls, rather than splitting the entire
> > memslot as is done when not using initially-all-set.
> >
> > Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  4 ++++
> >  arch/x86/kvm/mmu/mmu.c          | 30 ++++++++++++++++++++++--------
> >  2 files changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 432a4df817ec..6b5bf99f57af 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1591,6 +1591,10 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
> >  void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
> >  				      const struct kvm_memory_slot *memslot,
> >  				      int start_level);
> > +void kvm_mmu_try_split_large_pages(struct kvm *kvm,
>
> I would prefer we use hugepage when possible, mostly because that's the
> terminology used by the kernel. KVM is comically inconsistent, but if we
> make an effort to use hugepage when adding new code, hopefully someday
> we'll have enough inertia to commit fully to hugepage.

Will do.

>
> > +				   const struct kvm_memory_slot *memslot,
> > +				   u64 start, u64 end,
> > +				   int target_level);
> >  void kvm_mmu_slot_try_split_large_pages(struct kvm *kvm,
> >  					const struct kvm_memory_slot *memslot,
> >  					int target_level);
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 6768ef9c0891..4e78ef2dd352 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1448,6 +1448,12 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> >  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> >  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> >
> > +		/*
> > +		 * Try to proactively split any large pages down to 4KB so that
> > +		 * vCPUs don't have to take write-protection faults.
> > +		 */
> > +		kvm_mmu_try_split_large_pages(kvm, slot, start, end, PG_LEVEL_4K);
>
> This should return a value. If splitting succeeds, there should be no
> hugepages and so walking the page tables to write-protect 2M is
> unnecessary. Same for the previous patch, although skipping the
> write-protect path is a little less straightforward in that case.

Great idea! Will do. A rough sketch of what I have in mind is at the bottom
of this mail.

>
> > +
> >  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
> >
> >  		/* Cross two large pages? */
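
Rough, untested sketch for the CLEAR_DIRTY_LOG path. The helper name below
assumes the hugepage rename, and the bool return is hypothetical (the helper
in this series returns void); the idea is that it would return true iff no
huge pages remain in [start, end] after the split attempt:

	gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
	gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);

	/*
	 * Try to proactively split any huge pages down to 4KB so that
	 * vCPUs don't have to take write-protection faults. If every
	 * huge page in [start, end] was split, skip the write-protect
	 * walk entirely.
	 */
	if (!kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K)) {
		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);

		/* Cross two huge pages? */
		if (ALIGN(start << PAGE_SHIFT, PMD_SIZE) !=
		    ALIGN(end << PAGE_SHIFT, PMD_SIZE))
			kvm_mmu_slot_gfn_write_protect(kvm, slot, end,
						       PG_LEVEL_2M);
	}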