On Wed, Jan 5, 2022 at 9:55 AM David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> On Wed, Jan 5, 2022 at 1:02 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > On Mon, Dec 13, 2021 at 10:59:16PM +0000, David Matlack wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index c9e5fe290714..55640d73df5a 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -1362,6 +1362,20 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> > >                  gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> > >                  gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> > >
> > > +                /*
> > > +                 * Try to proactively split any huge pages down to 4KB so that
> > > +                 * vCPUs don't have to take write-protection faults.
> > > +                 *
> > > +                 * Drop the MMU lock since huge page splitting uses its own
> > > +                 * locking scheme and does not require the write lock in all
> > > +                 * cases.
> > > +                 */
> > > +                if (READ_ONCE(eagerly_split_huge_pages_for_dirty_logging)) {
> > > +                        write_unlock(&kvm->mmu_lock);
> > > +                        kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
> > > +                        write_lock(&kvm->mmu_lock);
> > > +                }
> > > +
> > >                  kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
> > Would it be easier to just allow passing in shared=true/false for the new
> > kvm_mmu_try_split_huge_pages(), so that the previous patch would not be
> > needed?  Or is it intended to do it for performance reasons?
> >
> > IOW, I think this patch does two things: (1) support clear-log on eager split,
> > and (2) allow lock degrade during eager split.
> >
> > It's just that imho (2) may still need some justification on necessity, since
> > this function only operates on a very small range of guest memory (at most
> > 64*4KB=256KB), so it's not clear to me whether the extra lock operations
> > are needed at all; after all it'll make the code slightly harder to follow.
> > Not to mention the previous patch is preparing for this, and both patches will
> > add lock operations.
> >
> > I think dirty_log_perf_test didn't cover the lock contention case, because
> > clear log was run after the vcpu threads stopped, so lock access should be
> > mostly hitting the cachelines there, afaict.  While in real life, clear log
> > is run with vcpus running.  Not sure whether that'll be a problem, so raising
> > this question.
> Good point. Dropping the write lock to acquire the read lock is
> probably not necessary since we're splitting a small region of memory
> here. Plus the splitting code detects contention and will drop the
> lock if necessary. And the value of dropping the lock is dubious since
> it adds a lot more lock operations.

I wasn't very clear here. I meant the value of "dropping the write lock to
switch to the read lock every time we split" is dubious, since it adds more
lock operations. Dropping the lock and yielding when contention is detected
is not dubious :).

> I'll try your suggestion in v3.
>
> >
> > Thanks,
> >
> > --
> > Peter Xu
> >
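
For reference, a minimal illustrative sketch of the yield-on-contention
pattern referred to above ("the splitting code detects contention and will
drop the lock if necessary"), as opposed to unconditionally bouncing between
the write and read lock on every call as in the hunk quoted earlier. This is
not the actual KVM eager-split code: it assumes kvm->mmu_lock is an rwlock_t
(as on x86 with the TDP MMU) and uses the generic kernel helpers
need_resched(), rwlock_needbreak() and cond_resched_rwlock_write(); the
helper name split_yield_if_contended() is made up for illustration.

#include <linux/kvm_host.h>
#include <linux/sched.h>

/*
 * Illustrative only: drop and re-acquire the MMU write lock when another
 * CPU is contending for it or a reschedule is due, instead of releasing
 * the lock on every iteration of a long walk.
 */
static bool split_yield_if_contended(struct kvm *kvm)
{
        if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
                /* Releases and re-takes kvm->mmu_lock for write. */
                cond_resched_rwlock_write(&kvm->mmu_lock);
                return true;
        }
        return false;
}

A splitting loop would call such a helper between iterations, so the write
lock is only given up when someone actually wants it; over a range of at
most 64 4KB pages this keeps the common case down to a single lock/unlock
pair.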