On Wed, Dec 1, 2021 at 11:22 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Nov 19, 2021, David Matlack wrote:
> > When using initially-all-set, large pages are not write-protected when
> > dirty logging is enabled on the memslot. Instead they are
> > write-protected once userspace invokes CLEAR_DIRTY_LOG for the first
> > time, and only for the specific sub-region of the memslot that
> > userspace wishes to clear.
> >
> > Enhance CLEAR_DIRTY_LOG to also try to split large pages prior to
> > write-protecting to avoid causing write-protection faults on vCPU
> > threads. This also allows userspace to smear the cost of large page
> > splitting across multiple ioctls, rather than splitting the entire
> > memslot as is done when not using initially-all-set.
> >
> > Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  4 ++++
> >  arch/x86/kvm/mmu/mmu.c          | 30 ++++++++++++++++++++++--------
> >  2 files changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 432a4df817ec..6b5bf99f57af 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1591,6 +1591,10 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
> >  void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
> >  				      const struct kvm_memory_slot *memslot,
> >  				      int start_level);
> > +void kvm_mmu_try_split_large_pages(struct kvm *kvm,
>
> I would prefer we use hugepage when possible, mostly because that's the
> terminology used by the kernel. KVM is comically inconsistent, but if we
> make an effort to use hugepage when adding new code, hopefully someday
> we'll have enough inertia to commit fully to hugepage.

Will do.

>
> > +				   const struct kvm_memory_slot *memslot,
> > +				   u64 start, u64 end,
> > +				   int target_level);
> >  void kvm_mmu_slot_try_split_large_pages(struct kvm *kvm,
> >  					const struct kvm_memory_slot *memslot,
> >  					int target_level);
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 6768ef9c0891..4e78ef2dd352 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1448,6 +1448,12 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> >  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> >  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> >
> > +		/*
> > +		 * Try to proactively split any large pages down to 4KB so that
> > +		 * vCPUs don't have to take write-protection faults.
> > +		 */
> > +		kvm_mmu_try_split_large_pages(kvm, slot, start, end, PG_LEVEL_4K);
>
> This should return a value. If splitting succeeds, there should be no
> hugepages and so walking the page tables to write-protect 2M is
> unnecessary. Same for the previous patch, although skipping the
> write-protect path is a little less straightforward in that case.

Great idea! Will do. A rough sketch of what I have in mind is at the bottom
of this mail.

>
> > +
> >  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
> >
> >  		/* Cross two large pages? */
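
Rough, untested sketch for the CLEAR_DIRTY_LOG path. The helper name below
assumes the hugepage rename, and the bool return is hypothetical (the helper
in this series returns void); the idea is that it would return true iff no
huge pages remain in [start, end] after the split attempt:

	gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
	gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);

	/*
	 * Try to proactively split any huge pages down to 4KB so that
	 * vCPUs don't have to take write-protection faults. If every
	 * huge page in [start, end] was split, skip the write-protect
	 * walk entirely.
	 */
	if (!kvm_mmu_try_split_huge_pages(kvm, slot, start, end, PG_LEVEL_4K)) {
		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);

		/* Cross two huge pages? */
		if (ALIGN(start << PAGE_SHIFT, PMD_SIZE) !=
		    ALIGN(end << PAGE_SHIFT, PMD_SIZE))
			kvm_mmu_slot_gfn_write_protect(kvm, slot, end,
						       PG_LEVEL_2M);
	}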