On Wed, Jan 5, 2022 at 1:02 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Mon, Dec 13, 2021 at 10:59:16PM +0000, David Matlack wrote:
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index c9e5fe290714..55640d73df5a 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1362,6 +1362,20 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> >  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> >  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> >
> > +		/*
> > +		 * Try to proactively split any huge pages down to 4KB so that
> > +		 * vCPUs don't have to take write-protection faults.
> > +		 *
> > +		 * Drop the MMU lock since huge page splitting uses its own
> > +		 * locking scheme and does not require the write lock in all
> > +		 * cases.
> > +		 */
> > +		if (READ_ONCE(eagerly_split_huge_pages_for_dirty_logging)) {
> > +			write_unlock(&kvm->mmu_lock);
> > +			kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
> > +			write_lock(&kvm->mmu_lock);
> > +		}
> > +
> >  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
>
> Would it be easier to just allow passing in shared=true/false for the new
> kvm_mmu_try_split_huge_pages(), so that the previous patch is not needed?
> Or is it intended to do it for performance reasons?
>
> IOW, I think this patch does two things: (1) support clear-log on eager split,
> and (2) allow lock degrade during eager split.
>
> It's just that imho (2) may still need some justification on necessity, since
> this function only operates on a very small range of guest memory (at most
> 64 * 4KB = 256KB per mask), so it's not clear to me whether the extra lock
> operations are needed at all; after all, they make the code slightly harder
> to follow. Not to mention the previous patch is preparing for this, and both
> patches add lock operations.
>
> I think dirty_log_perf_test didn't cover the lock contention case, because
> clear log was run after the vCPU threads stopped, so lock access should be
> mostly hitting the cachelines there, afaict. While in real life, clear log
> is run with vCPUs running. Not sure whether that'll be a problem, so raising
> this question up.

Good point. Dropping the write lock to acquire the read lock is probably
not necessary since we're splitting a small region of memory here. Plus
the splitting code detects contention and will drop the lock if
necessary. And the value of dropping the lock is dubious since it adds a
lot more lock operations. I'll try your suggestion in v3.

>
> Thanks,
>
> --
> Peter Xu
>
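
(For concreteness, the simplified call site I have in mind for v3 would look
roughly like the snippet below. It just reuses the names from the diff quoted
above and assumes kvm_mmu_try_split_huge_pages() is safe to call while holding
the MMU write lock; the actual patch may end up looking different.)

	/*
	 * Hypothetical sketch: split huge pages eagerly without dropping
	 * the MMU write lock, and rely on the split path itself to back
	 * off if it detects contention.
	 */
	if (READ_ONCE(eagerly_split_huge_pages_for_dirty_logging))
		kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);

	kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);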