On 07/12/2017 08:40 PM, Wei Wang wrote:
Add a new feature, VIRTIO_BALLOON_F_SG, which enables transferring a chunk of ballooned (i.e. inflated/deflated) pages to the host using scatter-gather lists.

The previous virtio-balloon implementation is not very efficient, because the balloon pages are transferred to the host one by one. Here is the breakdown of the time, in percentage, spent on each step of the balloon inflating process (inflating 7GB of an 8GB idle guest):

1) allocating pages (6.5%)
2) sending PFNs to host (68.3%)
3) address translation (6.1%)
4) madvise (19%)

It takes about 4126ms for the inflating process to complete. The above profiling shows that the bottlenecks are step 2) and step 4).

This patch optimizes step 2) by transferring pages to the host in sgs. An sg describes a chunk of guest physically contiguous pages. With this mechanism, step 4) can also be optimized by doing address translation and madvise() in chunks rather than page by page.

With this new feature, the above ballooning process takes ~491ms, an improvement of ~88%.
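To illustrate the chunking idea described above, here is a minimal user-space sketch in C. It is not the patch's code: `struct pfn_chunk` and `coalesce_pfns()` are hypothetical names, and the sketch assumes the PFNs arrive sorted in ascending order with no duplicates. It only shows how a list of page frame numbers collapses into (start, length) runs, each of which could then be described by a single sg entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: one run of physically contiguous guest pages. */
struct pfn_chunk {
    uint64_t start_pfn; /* first page frame number in the run */
    uint64_t nr_pages;  /* number of contiguous pages in the run */
};

/*
 * Coalesce an ascending, duplicate-free array of PFNs into contiguous runs.
 * Writes at most max_chunks entries to out and returns how many were written.
 * If out fills up, the remaining PFNs are left unprocessed.
 */
size_t coalesce_pfns(const uint64_t *pfns, size_t n,
                     struct pfn_chunk *out, size_t max_chunks)
{
    size_t nr_chunks = 0;

    for (size_t i = 0; i < n; i++) {
        if (nr_chunks > 0) {
            struct pfn_chunk *last = &out[nr_chunks - 1];

            /* Extend the current run if this PFN directly follows it. */
            if (pfns[i] == last->start_pfn + last->nr_pages) {
                last->nr_pages++;
                continue;
            }
        }

        /* Otherwise start a new run, if there is room for one. */
        if (nr_chunks == max_chunks)
            break;
        out[nr_chunks].start_pfn = pfns[i];
        out[nr_chunks].nr_pages = 1;
        nr_chunks++;
    }

    return nr_chunks;
}
```

For example, the PFNs 100, 101, 102, 200, 201, 500 collapse into three runs: (100, 3), (200, 2) and (500, 1). Six per-page transfers become three chunk descriptors, and the host side could likewise issue one madvise() call per run instead of one per page.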
I found that a recent mm patch, bb01b64cfab7c22f3848cb73dc0c2b46b8d38499, zeros all the ballooned pages, which is very time-consuming.

Tests show that with the above patch applied, the time to balloon 7GB of pages increases from ~491ms to ~2.8 seconds.

How about moving the zeroing operation to the hypervisor? That way, we would have a much faster balloon process.

Best,
Wei