Re: [PATCH v1 11/13] KVM: x86/mmu: Split huge pages during CLEAR_DIRTY_LOG

Peter Xu <peterx@xxxxxxxxxx> · Wed, 5 Jan 2022 17:02:06 +0800

On Mon, Dec 13, 2021 at 10:59:16PM +0000, David Matlack wrote:
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c9e5fe290714..55640d73df5a 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1362,6 +1362,20 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
>  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
>  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
>  
> +		/*
> +		 * Try to proactively split any huge pages down to 4KB so that
> +		 * vCPUs don't have to take write-protection faults.
> +		 *
> +		 * Drop the MMU lock since huge page splitting uses its own
> +		 * locking scheme and does not require the write lock in all
> +		 * cases.
> +		 */
> +		if (READ_ONCE(eagerly_split_huge_pages_for_dirty_logging)) {
> +			write_unlock(&kvm->mmu_lock);
> +			kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K);
> +			write_lock(&kvm->mmu_lock);
> +		}
> +
>  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);

Would it be easier to just allow passing in shared=true/false for the new
kvm_mmu_try_split_huge_pages(), then previous patch will not be needed?  Or is
it intended to do it for performance reasons?

IOW, I think this patch does two things: (1) support clear-log on eager split,
and (2) allow lock degrade during eager split.

It's just that imho (2) may still need some justification on necessity since
this function only operates on a very small range of guest mem (at most
4K*64KB=256KB range), so it's not clear to me whether the extra lock operations
are needed at all; after all it'll make the code slightly harder to follow.
Not to mention the previous patch is preparing for this, and both patches will
add lock operations.

I think dirty_log_perf_test didn't cover lock contention case, because clear
log was run after vcpu threads stopped, so lock access should be mostly hitting
the cachelines there, afaict.  While in real life, clear log is run with vcpus
running.  Not sure whether that'll be a problem, so raising this question up.

Thanks,

-- 
Peter Xu