On 03/21/2018 04:19 PM, Peter Xu wrote:
On Fri, Mar 16, 2018 at 04:05:14PM +0800, Xiao Guangrong wrote:
Hi David,
Thanks for your review.
On 03/15/2018 06:25 PM, Dr. David Alan Gilbert wrote:
migration/ram.c | 32 ++++++++++++++++----------------
Hi,
Do you have some performance numbers to show this helps? Were those
taken on a normal system or were they taken with one of the compression
accelerators (which I think the compression migration was designed for)?
Yes, I have tested it on my desktop (i7-4790 + 16G) by locally live migrating
a VM which has 8 vCPUs + 6G memory, with max-bandwidth limited to 350.
During the migration, a workload with 8 threads repeatedly writes to the
whole 6G of memory in the VM. Before this patchset its bandwidth is ~25 mbps;
after applying it, the bandwidth is ~50 mbps.
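
For reference, the in-VM dirtying workload was roughly of this shape
(a minimal sketch only; the thread count, buffer layout and touch
pattern here are assumptions, not the exact test program). Build with
something like: gcc -O2 dirty.c -pthread

/* Sketch of the dirtying workload described above: N threads keep
 * writing to a large anonymous buffer so the guest produces dirty
 * pages continuously during live migration. */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NR_THREADS   8
#define TOTAL_SIZE   (6UL << 30)              /* ~6G, split across threads */
#define CHUNK_SIZE   (TOTAL_SIZE / NR_THREADS)
#define PAGE_SIZE    4096UL

static void *dirty_loop(void *arg)
{
    unsigned char *buf = arg;

    for (;;) {
        /* touch one byte per page so every page is re-dirtied */
        for (unsigned long off = 0; off < CHUNK_SIZE; off += PAGE_SIZE) {
            buf[off]++;
        }
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[NR_THREADS];

    for (int i = 0; i < NR_THREADS; i++) {
        unsigned char *buf = malloc(CHUNK_SIZE);
        memset(buf, 0, CHUNK_SIZE);           /* fault the memory in up front */
        pthread_create(&threads[i], NULL, dirty_loop, buf);
    }
    pthread_join(threads[0], NULL);           /* never returns; run until killed */
    return 0;
}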
Hi, Guangrong,
Not really review comments, but I got some questions. :)
Your comments are always valuable to me! :)
IIUC this patch will only change the behavior when last_sent_block
changes. I see that the performance is doubled after the change,
which is really promising. However I don't fully understand why it
brings such a big difference, considering that IMHO the current code
sends dirty pages per-RAMBlock. I mean, IMHO last_sent_block should
not change frequently? Or am I wrong?
It depends on the configuration: each memory region that is RAM- or
file-backed has its own RAMBlock.
Actually, more of the benefit comes from the fact that the performance
and throughput of the compression threads have improved, as the threads
are fed by the migration thread and their results are consumed by the
migration thread.
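
To illustrate the feeding pattern being described, here is a simplified,
self-contained model only; the names, worker count and locking scheme
below are made up and this is not the actual migration/ram.c code.
Build with something like: gcc -O2 pool.c -pthread

/* Simplified model of the handoff: a producer (standing in for the
 * migration thread) hands dirty pages to a pool of compression worker
 * threads and only blocks when every worker is busy. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NR_WORKERS 2
#define PAGE_SIZE  4096

struct worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned char page[PAGE_SIZE];
    int has_work;                     /* producer sets, worker clears */
    int quit;
};

static struct worker workers[NR_WORKERS];

static void *worker_fn(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->quit) {
        if (!w->has_work) {
            pthread_cond_wait(&w->cond, &w->lock);
            continue;
        }
        /* stand-in for compressing w->page into a per-worker buffer */
        w->has_work = 0;
        pthread_cond_signal(&w->cond);  /* tell the producer we are free */
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Producer side: find an idle worker and feed it one page. */
static void send_page_to_pool(const unsigned char *page)
{
    for (;;) {
        for (int i = 0; i < NR_WORKERS; i++) {
            struct worker *w = &workers[i];
            pthread_mutex_lock(&w->lock);
            if (!w->has_work) {
                memcpy(w->page, page, PAGE_SIZE);
                w->has_work = 1;
                pthread_cond_signal(&w->cond);
                pthread_mutex_unlock(&w->lock);
                return;
            }
            pthread_mutex_unlock(&w->lock);
        }
        /* all busy: real code would wait on a condition instead of
         * spinning; kept minimal here */
    }
}

int main(void)
{
    unsigned char page[PAGE_SIZE] = { 0 };

    for (int i = 0; i < NR_WORKERS; i++) {
        pthread_mutex_init(&workers[i].lock, NULL);
        pthread_cond_init(&workers[i].cond, NULL);
        pthread_create(&workers[i].thread, NULL, worker_fn, &workers[i]);
    }
    for (int n = 0; n < 1024; n++) {
        send_page_to_pool(page);        /* "migration thread" feeding pages */
    }
    printf("fed 1024 pages to %d workers\n", NR_WORKERS);
    return 0;
}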
Another follow-up question would be: have you measured how long it
takes to compress a 4k page, and how long to send it? I think
"sending the page" is not really meaningful, considering that we just
put the page into the buffer (which should be extremely fast since we
don't really flush it every time); however I would be curious how
slow compressing a page would be.
I haven't benchmarked the performance of zlib; I think it is a
CPU-intensive workload, particularly as there is no compression
accelerator (e.g., QAT) in our production environment. BTW, we were
using lzo instead of zlib, which worked better for some workloads.
Putting a page into the buffer should depend on the network, i.e., if
the network is congested it should take a long time. :)
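
For a rough idea of how long compressing one 4k page takes, a minimal
zlib micro-benchmark along these lines could be used (the page
contents, compression level and iteration count here are arbitrary
assumptions). Build with something like: gcc -O2 page_bench.c -lz

/* Compress one 4k page many times and report the average cost. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <zlib.h>

#define PAGE_SIZE 4096
#define ITERS     100000

int main(void)
{
    unsigned char page[PAGE_SIZE];
    unsigned char out[PAGE_SIZE * 2];   /* > compressBound(PAGE_SIZE) */
    struct timespec t0, t1;

    /* semi-compressible content; an all-zero page would be unrealistically fast */
    for (int i = 0; i < PAGE_SIZE; i++) {
        page[i] = (i & 0xff) ^ (rand() & 0x0f);
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        uLongf out_len = sizeof(out);
        if (compress2(out, &out_len, page, PAGE_SIZE,
                      Z_DEFAULT_COMPRESSION) != Z_OK) {
            fprintf(stderr, "compress2 failed\n");
            return 1;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("avg %.1f us per 4k page\n", ns / ITERS / 1000.0);
    return 0;
}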