Re: [RFC PATCH 00/15] KVM: x86/mmu: Eager Page Splitting for the TDP MMU

On Fri, Nov 26, 2021 at 6:13 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> Hi, David,
>
> On Fri, Nov 19, 2021 at 11:57:44PM +0000, David Matlack wrote:
> > This series is a first pass at implementing Eager Page Splitting for the
> > TDP MMU. For context on the motivation and design of Eager Page
> > Splitting, please see the RFC design proposal and discussion [1].
> >
> > Paolo, I went ahead and added splitting in both the initially-all-set
> > case (only splitting the region passed to CLEAR_DIRTY_LOG) and the
> > case where we are not using initially-all-set (splitting the entire
> > memslot when dirty logging is enabled) to give you an idea of what
> > both look like.
> >
> > Note: I will be on vacation all of next week so I will not be able to
> > respond to reviews until Monday November 29. I thought it would be
> > useful to seed discussion and reviews with an early version of the code
> > rather than putting it off another week. But feel free to also ignore
> > this until I get back :)
> >
> > This series compiles and passes the most basic splitting test:
> >
> > $ ./dirty_log_perf_test -s anonymous_hugetlb_2mb -v 2 -i 4
> >
> > But please operate under the assumption that this code is probably
> > buggy.
> >
> > [1] https://lore.kernel.org/kvm/CALzav=dV_U4r1K9oDq4esb4mpBQDQ2ROQ5zH5wV3KpOaZrRW-A@xxxxxxxxxxxxxx/#t
>
> Will there be more numbers to show in the formal patchset?

Yes, definitely. I didn't have a lot of time to test this series, hence
the RFC status. I'll include more thorough testing and performance
evaluation in the cover letter for v1.

> It's interesting to
> know how "First Pass Dirty Memory Time" will change compared to the RFC
> numbers; I have a feel for it, but still. :) Also, not only how it
> speeds up guest apps that dirty memory, but also some general
> measurement of how much it slows down KVM_SET_USER_MEMORY_REGION
> (!init-all-set) or CLEAR_LOG (init-all-set) would be even nicer (for
> CLEAR, I guess the 1st and 2nd+ rounds will have different overhead).
>
> Besides that, I'm also wondering whether we should still have a knob
> for it, in case the use case is one where eagerly splitting huge pages
> may not help at all.  What I'm thinking:
>
>   - Read-mostly guest workload: splitting huge pages will speed up the
>     rare writes, but at the same time drag readers down due to the
>     huge->small page mappings.
>
>   - Writes-over-a-very-limited-region workload: say we have a 1T guest
>     and the app in the guest only writes to a 10G part of it.  Hmm, not
>     sure whether that exists..
>
>   - Postcopy targeted: precopy may only run a few iterations just to
>     send the static pages, so the migration duration will be relatively
>     short, and the writes won't have spread much across the whole guest
>     memory.
>
> I don't really think any of these examples is strong enough, as they're
> all corner cases, but they show why I wanted to raise the question of
> whether unconditional eager splitting is the best approach.

I'd be happy to add a knob if there's a userspace that wants to use
it. I think the main challenge, though, is knowing when it is safe to
disable eager splitting. For a small deployment where you know the VM
workload, it might make sense. But for a public cloud provider, the
only feasible way would be to dynamically monitor each guest's write
patterns, which puts us back at square one because that monitoring
would itself require dirty logging. And even then, there's no
guaranteed way to predict future guest write patterns from past ones.
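
For what it's worth, if we do end up adding one, the natural shape
would be a KVM module parameter. A minimal sketch, assuming a
hypothetical parameter name and split helper (neither exists in this
series):

  /*
   * Hypothetical knob (not part of this series): a module parameter
   * so admins can opt out of eager splitting without rebuilding KVM.
   */
  static bool __read_mostly eager_page_split = true;
  module_param(eager_page_split, bool, 0644);

  /* ...then, wherever enabling dirty logging triggers splitting: */
  if (READ_ONCE(eager_page_split))
          kvm_mmu_split_huge_pages(kvm, slot);  /* made-up helper */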

The way forward here might be to do a hybrid of 2M and 4K dirty
tracking (and maybe even 1G). For example, first start dirty logging
at 2M granularity, and then log at 4K for any specific regions or
memslots that aren't making progress. We'd still use Eager Page
Splitting unconditionally though, first to split to 2M and then to
split to 4K.
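
To sketch the shape of that policy (pure pseudocode; every helper
below is made up, only PG_LEVEL_2M/PG_LEVEL_4K are real constants):

  /* Phase 1: eagerly split 1G -> 2M, track dirty at 2M granularity. */
  enable_dirty_logging(vm, PG_LEVEL_2M);

  while (!migration_converged(vm)) {
          for_each_memslot(vm, slot) {
                  /*
                   * Phase 2: slots whose dirty rate isn't dropping get
                   * split again, 2M -> 4K, and tracked at 4K.
                   */
                  if (!slot_making_progress(slot))
                          enable_dirty_logging_slot(slot, PG_LEVEL_4K);
          }
          send_and_clear_dirty_bitmaps(vm);
  }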

>
> Thanks,
>
> --
> Peter Xu
>