On Wed, Mar 6, 2019 at 11:18 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 06.03.19 20:08, Alexander Duyck wrote:
> > On Wed, Mar 6, 2019 at 11:00 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 06.03.19 19:43, Michael S. Tsirkin wrote:
> >>> On Wed, Mar 06, 2019 at 01:30:14PM -0500, Nitesh Narayan Lal wrote:
> >>>>>> Here are the results:
> >>>>>>
> >>>>>> Procedure: 3 guests of size 5GB are launched on a single NUMA node
> >>>>>> with total memory of 15GB and no swap. In each of the guests, memhog
> >>>>>> is run with 5GB. After memhog completes, host memory usage is
> >>>>>> monitored using the free command.
> >>>>>>
> >>>>>> Without Hinting:
> >>>>>>              Time of execution     Host used memory
> >>>>>> Guest 1:     45 seconds            5.4 GB
> >>>>>> Guest 2:     45 seconds            10 GB
> >>>>>> Guest 3:     1 minute              15 GB
> >>>>>>
> >>>>>> With Hinting:
> >>>>>>              Time of execution     Host used memory
> >>>>>> Guest 1:     49 seconds            2.4 GB
> >>>>>> Guest 2:     40 seconds            4.3 GB
> >>>>>> Guest 3:     50 seconds            6.3 GB
> >>>>> OK so no improvement.
> >>>> If we are looking in terms of the memory we are getting back from the
> >>>> guest, then there is an improvement. However, if we are looking at the
> >>>> improvement in terms of time of execution of memhog, then yes, there
> >>>> is none.
> >>>
> >>> Yes, but the way I see it you can't overcommit this unused memory
> >>> since guests can start using it at any time. You timed it carefully
> >>> such that this does not happen, but what will cause this timing on real
> >>> guests?
> >>
> >> Whenever you overcommit you will need backup swap. There is no way
> >> around it. It just makes the probability of you having to go to disk
> >> less likely.
> >>
> >> If you assume that all of your guests will be using all of their memory
> >> all the time, you don't have to think about overcommitting memory in the
> >> first place. But this is not what we usually have.
> >
> > Right, but the general idea is that free page hinting allows us to
> > avoid having to use the swap if we are hinting the pages as unused.
> > The general assumption we are working with is that some percentage of
> > the VMs are unused most of the time, so you can share those resources
> > between multiple VMs and have them free those up normally.
>
> Yes, similar to VCPU yielding or plain scheduling when the VCPU is
> sleeping. Instead of busy looping, hand over the resource to somebody
> who can actually make use of it.
> >
> > If we can reduce swap usage we can improve overall performance, and
> > that was what I was pointing out with my test. I had also done
> > something similar to what Nitesh was doing with his original test,
> > where I launched 8 VMs with 8GB of memory per VM on a system with
> > 32G of RAM and only 4G of swap. In that setup I could keep a couple
> > of VMs busy at a time without issues, and obviously without the patch
> > I just started to OOM qemu instances and could only have 4 VMs
> > running at a time at maximum.
>
> While these are nice experiments (especially to showcase reduced swap
> usage!), I would not suggest using 4GB of swap on a 2x overcommitted
> system (32GB overcommitted). Disks are so cheap nowadays that one does
> not have to play with fire.

Right. The only reason for using 4G is that the system normally has
128G of RAM available and I didn't really think I would need swap for
the system when I originally configured it.

> But yes, reducing swap usage implies better overall system performance
> (unless the hinting is terribly slow :) ). Reducing swap usage, not
> swap space :)

Right.
Also, swap is really a necessity if we are going to look at things like
MADV_FREE, since I have not seen us actually start to free up resources
until we start putting some pressure on swap.
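
Just to illustrate what I am getting at, here is a quick userspace
sketch (nothing from the series itself; the file name and the 1GB size
are made up for the example) showing why MADV_FREE'd memory doesn't
show up as free on the host until reclaim actually runs:

/*
 * madv_free_demo.c - toy example, not part of the hinting series.
 * Touch a chunk of anonymous memory, then mark it MADV_FREE. The
 * pages stay charged to the process (and thus to the host) until
 * the kernel is under enough memory pressure to lazily reclaim
 * them. Needs Linux 4.5+ for MADV_FREE.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK (1024UL * 1024 * 1024)	/* 1GB, arbitrary test size */

int main(void)
{
	char *buf = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return EXIT_FAILURE;
	}

	/* Touch every page so it is actually backed by RAM (like memhog). */
	memset(buf, 1, CHUNK);

	/*
	 * Tell the kernel we no longer need the contents. Unlike
	 * MADV_DONTNEED this is lazy: nothing is freed right away,
	 * so "used" in free(1) barely moves until reclaim kicks in.
	 */
	if (madvise(buf, CHUNK, MADV_FREE))
		perror("madvise(MADV_FREE)");

	getchar();	/* park here and watch free(1) from another terminal */

	munmap(buf, CHUNK);
	return EXIT_SUCCESS;
}

If you build that and watch free(1) in another terminal, the "used"
number stays put after the madvise() call until something else starts
asking for memory, which is the same effect I am describing above.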