Hi Pasha,

On 2/21/23 3:31 AM, Pasha Tatashin wrote:
> As a part of an ongoing work of replacing some containerized work load
> with virtual machines within Google, I have worked on making the memory
> translations faster.
>
> I would like to propose the following topic for this year's LSF/MM/BPF:
>
> Discuss a set of techniques that can improve the guest performance,
> memory footprint overhead, observability, and manageability of virtual
> machines by hypervirtualizing the guest memory to the extreme. The end
> goal is to allow very lightweight virtual machines to be closer in
> performance to the containers.
>
> The following items are going to be discussed in this topic:
>
> - Reducing the cost of SLAT page table translations.
> - Reducing the memory footprint overhead.
> - Reducing the memory management overhead.
> - Increasing the observability of guest memory.

I'm mainly trying to understand the problem and the possible solutions or
directions.

I googled 'SLAT', which pointed me to x86's EPT. ARM64 has a similar
mechanism called the stage-2 page table. The usual way to reduce the page
table translation cost is to map contiguous memory with block mappings at
the PUD/PMD level. I'm not sure whether there are other solutions we're
heading for?

Guest memory is usually backed by a virtual memory area (VMA), which is
either an anonymous or a hugetlb region. As I understand it, populating the
requested memory involves excessive page fault handling. I'm not sure
whether reducing the memory management overhead means making that path
faster, or something else? :)

Thanks,
Gavin
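
For reference, a rough sketch of why PUD/PMD block mappings reduce the translation cost. It counts the worst-case memory references for one TLB miss under two-stage translation (EPT on x86, stage-2 on ARM64). It assumes 4-level tables at both stages by default and no page-walk caches, so real hardware will usually do better. The function name and the configurations listed are only illustrative.

```python
def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Worst-case memory references for one TLB miss with two-stage translation.

    Assumption: no paging-structure or walk caches, so every step hits memory.
    """
    # Every guest table pointer, plus the final guest physical address, is a
    # guest-physical address that needs its own second-stage walk:
    # (guest_levels + 1) walks of host_levels references each.
    second_stage_refs = (guest_levels + 1) * host_levels
    # On top of that, each guest page table entry itself must be read once.
    return second_stage_refs + guest_levels


# (label, guest levels walked, second-stage levels walked)
# A PMD block mapping ends the walk one level early, a PUD block two levels.
configs = [
    ("4K guest / 4K second stage", 4, 4),
    ("4K guest / 2M (PMD) second stage", 4, 3),
    ("4K guest / 1G (PUD) second stage", 4, 2),
    ("2M guest / 2M (PMD) second stage", 3, 3),
]

for label, guest_levels, host_levels in configs:
    print(f"{label}: {nested_walk_refs(guest_levels, host_levels)} references")
```

Under these assumptions the count drops from 24 references with 4K pages at both stages to 19 with PMD blocks and 14 with PUD blocks at the second stage, which is the saving the block mappings are after.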