Re: [LSF/MM/BPF TOPIC] VM Memory Overcommit

David Hildenbrand <david@xxxxxxxxxx> · Tue, 28 Feb 2023 10:20:57 +0100

On 23.02.23 00:59, T.J. Alumbaugh wrote:
> Hi,
>
> This topic proposal would be to present and discuss multiple MM
> features to improve host memory overcommit while running VMs. There
> are two general cases:
>
> 1. The host and its guests operate independently,
>
> 2. The host and its guests cooperate by techniques like ballooning.
>
> In the first case, we would discuss some new techniques, e.g., fast
> access bit harvesting in the KVM MMU, and some difficulties, e.g.,
> double zswapping.
>
> In the second case, we would like to discuss a novel working set size
> (WSS) notifier framework and some improvements to the ballooning
> policy. The WSS notifier, when available, can report WSS to its
> listeners. VM Memory Overcommit is one of its use cases: the
> virtio-balloon driver can register for WSS notifications and relay WSS
> to the host. The host can leverage the WSS notifications and improve
> the ballooning policy.
>
> This topic would be of interest to a wide range of audience, e.g.,
> phones, laptops and servers.
> Co-presented with Yuanchu Xie.

In general, having the WSS available to the hypervisor might be
beneficial. I recall that there was an idea to leverage MGLRU and to
communicate MGLRU statistics to the hypervisor, such that the hypervisor
can make decisions using these statistics.

But note that I don't think that the future will be traditional memory
balloon inflation/deflation. I think it might be useful in a related
context, though.

What we actually might want is a way to tell the OS running inside the
VM to "please try not to use more than XXX MiB of physical memory" but
treat it as a soft limit. So in case we mess up, or there is a sudden
peak in memory consumption due to a workload, we won't harm the guest
OS/workload, and don't have to act immediately to avoid trouble. One can
think of it as an evolution of memory ballooning: instead of creating
artificial memory pressure by inflating the balloon, which is fairly
event-driven and requires explicit deflation, we teach the OS to do it
natively and pair it with free page reporting.

All free physical memory inside the VM can be reported using free page
reporting to the hypervisor, and the OS will try to stick to the
requested "logical" VM size, unless there is real demand for more memory.
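
To make the soft-limit idea a bit more concrete, here is a toy model in
Python. It is only a sketch under assumptions: the Guest class, its
fields, and the split into "anon" and "cache" memory are made up for
illustration and do not correspond to any existing kernel or virtio
interface. The guest prefers dropping cheaply reclaimable memory over
crossing the soft limit, may still exceed it on real demand, and
whatever is free is what free page reporting could hand back.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    total_mib: int        # memory the VM can really use (hard maximum)
    soft_limit_mib: int   # hypothetical hint: "try not to use more than this"
    anon_mib: int = 0     # memory with real demand, not cheaply reclaimable
    cache_mib: int = 0    # cheaply reclaimable memory, e.g. clean page cache

    @property
    def used_mib(self) -> int:
        return self.anon_mib + self.cache_mib

    def _reclaim_cache(self, want_mib: int) -> int:
        """Drop up to want_mib of cache and return how much was freed."""
        freed = min(max(want_mib, 0), self.cache_mib)
        self.cache_mib -= freed
        return freed

    def allocate(self, mib: int, cache: bool = False) -> bool:
        """Allocate mib, reclaiming cache before crossing the soft limit."""
        over_soft = self.used_mib + mib - self.soft_limit_mib
        if over_soft > 0:
            # Native "self-ballooning": try to stay at the logical VM size.
            self._reclaim_cache(over_soft)
        if self.used_mib + mib > self.total_mib:
            return False  # only the hard maximum makes an allocation fail
        if cache:
            self.cache_mib += mib
        else:
            self.anon_mib += mib
        return True

    def free_to_report_mib(self) -> int:
        """Free memory that free page reporting could give to the hypervisor."""
        return self.total_mib - self.used_mib


guest = Guest(total_mib=4096, soft_limit_mib=2048)
guest.allocate(1024)              # real demand
guest.allocate(1024, cache=True)  # page cache fills up to the soft limit
guest.allocate(512)               # cache is dropped, the limit is respected
guest.allocate(1024)              # real demand exceeds the soft limit: allowed
print(guest.used_mib, guest.free_to_report_mib())
```

Note that nothing in this model has to be deflated explicitly: exceeding
the soft limit is simply allowed, and the hypervisor sees it as less
free memory being reported.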

--
Thanks,

David / dhildenb