Re: [PATCH v2 8/8] migration: do not flush_compressed_data at the end of each iteration

Xiao Guangrong <guangrong.xiao@xxxxxxxxx> · Mon, 23 Jul 2018 16:05:21 +0800

On 07/23/2018 01:49 PM, Peter Xu wrote:
On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@xxxxxxxxx wrote:
From: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>

flush_compressed_data() needs to wait for all compression threads to
finish their work; after that, all threads stay idle until the
migration feeds new requests to them. Calling it less often can improve
throughput and use CPU resources more effectively.

We do not need to flush all threads at the end of each iteration; the
data can be kept locally until the memory block changes or memory
migration starts over. In the latter case we will meet a dirtied
page which may still exist in a compression thread's ring.

Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>
---
  migration/ram.c | 15 ++++++++++++++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 89305c7af5..fdab13821d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -315,6 +315,8 @@ struct RAMState {
      uint64_t iterations;
      /* number of dirty bits in the bitmap */
      uint64_t migration_dirty_pages;
+    /* last dirty_sync_count we have seen */
+    uint64_t dirty_sync_count;

Better suffix it with "_prev" as well?  So that we can quickly
identify that it's only a cache and it can be different from the one
in the ram_counters.

Indeed, will update it.


      /* protects modification of the bitmap */
      QemuMutex bitmap_mutex;
      /* The RAMBlock used in the last src_page_requests */
@@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
      }
  
      xbzrle_cleanup();
+    flush_compressed_data(*rsp);

Could I ask why we need this, considering that we have
compress_threads_save_cleanup() right down there?

Dave asked about it too. :(

"This is for the error condition: if any error occurs during live migration,
there is no chance to call ram_save_complete. After switching to the lockless
multithread model, we assert that all requests have been handled before
destroying the worker threads."

That makes sure there is nothing left in the threads before calling
compress_threads_save_cleanup(), which matches the current behavior. With the
lockless multithread model, we check that all requests are free before
destroying the threads.
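
To make the ordering concrete, here is a minimal sketch of "flush first, then destroy", assuming a much simplified request ring. SketchRing, sketch_flush(), sketch_destroy() and sketch_cleanup_on_error() are hypothetical names used only for illustration; they are not the structures or functions in migration/ram.c, and a real flush would wait for worker threads instead of clearing flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a compression thread's request ring. */
#define SKETCH_RING_SIZE 4

typedef struct SketchRing {
    bool in_flight[SKETCH_RING_SIZE]; /* true while a request is pending */
    bool destroyed;                   /* set once the ring is torn down */
} SketchRing;

/* Returns true when no request is pending in the ring. */
static bool sketch_all_free(const SketchRing *ring)
{
    for (size_t i = 0; i < SKETCH_RING_SIZE; i++) {
        if (ring->in_flight[i]) {
            return false;
        }
    }
    return true;
}

/*
 * Stand-in for a flush: the real code would wait for the workers to
 * finish; this sketch simply marks every request as completed.
 */
static void sketch_flush(SketchRing *ring)
{
    for (size_t i = 0; i < SKETCH_RING_SIZE; i++) {
        ring->in_flight[i] = false;
    }
}

/* Teardown asserts that nothing is pending, as described above. */
static void sketch_destroy(SketchRing *ring)
{
    assert(sketch_all_free(ring));
    ring->destroyed = true;
}

/* Error path: drain outstanding requests before destroying the ring. */
static void sketch_cleanup_on_error(SketchRing *ring)
{
    sketch_flush(ring);
    sketch_destroy(ring);
}
```

In this sketch, calling sketch_destroy() directly on a ring that still has pending requests would trip the assertion, which is the situation the extra flush in the cleanup path is meant to avoid.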


      compress_threads_save_cleanup();
      ram_state_cleanup(rsp);
  }
@@ -3203,6 +3206,17 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
  
      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
  
+    /*
+     * if memory migration starts over, we will meet a dirtied page which
+     * may still exist in a compression thread's ring, so we should flush
+     * the compressed data to make sure the new page is not overwritten by
+     * the old one in the destination.
+     */
+    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
+        rs->dirty_sync_count = ram_counters.dirty_sync_count;
+        flush_compressed_data(rs);
+    }
+
      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
      i = 0;
      while ((ret = qemu_file_rate_limit(f)) == 0 ||
@@ -3235,7 +3249,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
          }
          i++;
      }
-    flush_compressed_data(rs);

This looks sane to me, but I'd like to see what other people think
about it too...

Thank you a lot, Peter! :)
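
For anyone following the thread, the check added to ram_save_iterate() can be sketched in isolation as below. This is only an illustration under stated assumptions: SketchState and sketch_should_flush() are hypothetical names, the field uses the "_prev" suffix suggested above, and the global counter is passed in as a plain argument rather than read from ram_counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the relevant part of RAMState. */
typedef struct SketchState {
    uint64_t dirty_sync_count_prev; /* last sync count this state has seen */
    unsigned flushes;               /* how many flushes were triggered */
} SketchState;

/*
 * Returns true, and records a flush, only when the global sync count has
 * moved since the previous call, i.e. when memory migration started over.
 * Otherwise the compressed data is left in the threads.
 */
static bool sketch_should_flush(SketchState *s, uint64_t global_sync_count)
{
    if (global_sync_count != s->dirty_sync_count_prev) {
        s->dirty_sync_count_prev = global_sync_count;
        s->flushes++;
        return true;
    }
    return false;
}
```

With this shape, several consecutive iterations within the same sync round trigger no flush at all, and one flush happens each time the counter changes.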