On Tue, Nov 30, 2021 at 4:16 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> On Fri, Nov 26, 2021 at 4:17 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 19, 2021 at 11:57:57PM +0000, David Matlack wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 6768ef9c0891..4e78ef2dd352 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -1448,6 +1448,12 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> > >  		gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> > >  		gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> > >
> > > +		/*
> > > +		 * Try to proactively split any large pages down to 4KB so that
> > > +		 * vCPUs don't have to take write-protection faults.
> > > +		 */
> > > +		kvm_mmu_try_split_large_pages(kvm, slot, start, end, PG_LEVEL_4K);
> > > +
> > >  		kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
> > >
> > >  		/* Cross two large pages? */
> >
> > Is it intended to try split every time even if we could have split it already?
> > As I remember Paolo mentioned that we can skip split if it's not the 1st
> > CLEAR_LOG on the same range, and IIUC that makes sense.
> >
> > But indeed I don't see a trivial way to know whether this is the first clear of
> > this range.  Maybe we can maintain "how many huge pages are there under current
> > kvm_mmu_page node" somehow?  Then if root sp has the counter==0, then we can
> > skip it.  Just a wild idea..
> >
> > Or maybe it's intended to try split unconditionally for some reason?  If so
> > it'll be great to mention that either in the commit message or in comments.
>
> Thanks for calling this out. Could the same be said about the existing
> code that unconditionally tries to write-protect 2M+ pages? I aimed to
> keep parity with the write-protection calls (always try to split
> before write-protecting) but I agree there might be opportunities
> available to skip altogether.
>
> By the way, looking at this code again I think I see some potential bugs:
>  - I don't think I ever free split_caches in the initially-all-set case.
>  - What happens if splitting fails the CLEAR_LOG but succeeds the
> CLEAR_LOG?

Gah, meant to say "first CLEAR_LOG" and "second CLEAR_LOG" here.

> We would end up propagating the write-protection on the 2M
> page down to the 4K page. This might cause issues if using PML.
>
> >
> > Thanks,
> >
> > --
> > Peter Xu