On Wed, Mar 6, 2019 at 11:18 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 06.03.19 20:08, Alexander Duyck wrote:
> > On Wed, Mar 6, 2019 at 11:00 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 06.03.19 19:43, Michael S. Tsirkin wrote:
> >>> On Wed, Mar 06, 2019 at 01:30:14PM -0500, Nitesh Narayan Lal wrote:
> >>>>>> Here are the results:
> >>>>>>
> >>>>>> Procedure: 3 guests of size 5GB are launched on a single NUMA node
> >>>>>> with total memory of 15GB and no swap. In each of the guests, memhog
> >>>>>> is run with 5GB. After memhog completes, host memory usage is
> >>>>>> monitored using the free command.
> >>>>>>
> >>>>>> Without Hinting:
> >>>>>>              Time of execution     Host used memory
> >>>>>> Guest 1:     45 seconds            5.4 GB
> >>>>>> Guest 2:     45 seconds            10 GB
> >>>>>> Guest 3:     1 minute              15 GB
> >>>>>>
> >>>>>> With Hinting:
> >>>>>>              Time of execution     Host used memory
> >>>>>> Guest 1:     49 seconds            2.4 GB
> >>>>>> Guest 2:     40 seconds            4.3 GB
> >>>>>> Guest 3:     50 seconds            6.3 GB
> >>>>> OK so no improvement.
> >>>> If we are looking in terms of the memory we are getting back from the
> >>>> guest, then there is an improvement. However, if we are looking at the
> >>>> improvement in terms of time of execution of memhog, then yes, there
> >>>> is none.
> >>>
> >>> Yes, but the way I see it you can't overcommit this unused memory
> >>> since guests can start using it at any time. You timed it carefully
> >>> such that this does not happen, but what will cause this timing on real
> >>> guests?
> >>
> >> Whenever you overcommit you will need backup swap. There is no way
> >> around it. It just makes the probability of you having to go to disk
> >> less likely.
> >>
> >> If you assume that all of your guests will be using all of their memory
> >> all the time, you don't have to think about overcommitting memory in the
> >> first place. But this is not what we usually have.
> >
> > Right, but the general idea is that free page hinting allows us to
> > avoid having to use the swap if we are hinting the pages as unused.
> > The general assumption we are working with is that some percentage of
> > the VMs are unused most of the time, so you can share those resources
> > between multiple VMs and have them free those up normally.
>
> Yes, similar to VCPU yielding or plain scheduling when the VCPU is
> sleeping. Instead of busy looping, hand over the resource to somebody
> who can actually make use of it.
> >
> > If we can reduce swap usage we can improve overall performance, and
> > that was what I was pointing out with my test. I had also done
> > something similar to what Nitesh was doing with his original test,
> > where I launched 8 VMs with 8GB of memory per VM on a system with
> > 32G of RAM and only 4G of swap. In that setup I could keep a couple
> > of VMs busy at a time without issues, and obviously without the patch
> > I just started to OOM qemu instances and could only have 4 VMs
> > running at a time at maximum.
>
> While these are nice experiments (especially to showcase reduced swap
> usage!), I would not suggest using 4GB of swap on a 2x overcommitted
> system (32GB overcommitted). Disks are so cheap nowadays that one does
> not have to play with fire.

Right. The only reason for using 4G is that the system normally has
128G of RAM available and I didn't really think I would need swap for
the system when I originally configured it.

> But yes, reducing swap usage implies better overall system performance
> (unless the hinting is terribly slow :) ). Reducing swap usage, not
> swap space :)

Right.
Also, swap is really a necessity if we are going to look at things like
MADV_FREE, since I have not seen us actually start to free up resources
until we start putting some pressure on swap.
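
Just to illustrate what I am getting at, here is a quick userspace
sketch (nothing from the series itself; the file name and the 1GB size
are made up for the example) showing why MADV_FREE'd memory doesn't
show up as free on the host until reclaim actually runs:

/*
 * madv_free_demo.c - toy example, not part of the hinting series.
 * Touch a chunk of anonymous memory, then mark it MADV_FREE. The
 * pages stay charged to the process (and thus to the host) until
 * the kernel is under enough memory pressure to lazily reclaim
 * them. Needs Linux 4.5+ for MADV_FREE.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK (1024UL * 1024 * 1024)	/* 1GB, arbitrary test size */

int main(void)
{
	char *buf = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return EXIT_FAILURE;
	}

	/* Touch every page so it is actually backed by RAM (like memhog). */
	memset(buf, 1, CHUNK);

	/*
	 * Tell the kernel we no longer need the contents. Unlike
	 * MADV_DONTNEED this is lazy: nothing is freed right away,
	 * so "used" in free(1) barely moves until reclaim kicks in.
	 */
	if (madvise(buf, CHUNK, MADV_FREE))
		perror("madvise(MADV_FREE)");

	getchar();	/* park here and watch free(1) from another terminal */

	munmap(buf, CHUNK);
	return EXIT_SUCCESS;
}

If you build that and watch free(1) in another terminal, the "used"
number stays put after the madvise() call until something else starts
asking for memory, which is the same effect I am describing above.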