Re: [RFC PATCH 13/15] KVM: x86/mmu: Split large pages during CLEAR_DIRTY_LOG

Peter Xu <peterx@xxxxxxxxxx> · Fri, 26 Nov 2021 20:17:12 +0800

On Fri, Nov 19, 2021 at 11:57:57PM +0000, David Matlack wrote:
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 6768ef9c0891..4e78ef2dd352 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1448,6 +1448,12 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
>  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
>  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
>  
> +		/*
> +		 * Try to proactively split any large pages down to 4KB so that
> +		 * vCPUs don't have to take write-protection faults.
> +		 */
> +		kvm_mmu_try_split_large_pages(kvm, slot, start, end, PG_LEVEL_4K);
> +
>  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
>  
>  		/* Cross two large pages? */

Is it intended to try split every time even if we could have split it already?
As I remember Paolo mentioned that we can skip split if it's not the 1st
CLEAR_LOG on the same range, and IIUC that makes sense.

But indeed I don't see a trivial way to know whether this is the first clear of
this range.  Maybe we can maintain "how many huge pages are there under current
kvm_mmu_page node" somehow?  Then if root sp has the counter==0, then we can
skip it.  Just a wild idea..

Or maybe it's intended to try split unconditionally for some reason?  If so
it'll be great to mention that either in the commit message or in comments.

Thanks,

-- 
Peter Xu