Quan Xu <quan.xu0@xxxxxxxxx> wrote:
> From 8dbf7370e7ea1caab0b769d0d4dcdd072d14d421 Mon Sep 17 00:00:00 2001
> From: Quan Xu <quan.xu0@xxxxxxxxx>
> Date: Wed, 29 Aug 2018 21:33:14 +0800
> Subject: [PATCH RFC] migration: make sure to run iterate precopy during the
>  bulk stage
>
> Since the bulk stage assumes (in migration_bitmap_find_dirty) that every
> page is dirty, return a rough total RAM size as the pending size, to make
> sure that the migration thread continues to run iterate precopy during the
> bulk stage.
>
> Otherwise the downtime grows unpredictably, as the migration thread needs
> to send both the rest of the pages and the dirty pages during complete
> precopy.
>
> Signed-off-by: Quan Xu <quan.xu0@xxxxxxxxx>
> ---
>  migration/ram.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 79c8942..cfa304c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3308,7 +3308,8 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>          /* We can do postcopy, and all the data is postcopiable */
>          *res_compatible += remaining_size;
>      } else {
> -        *res_precopy_only += remaining_size;
> +        *res_precopy_only += (rs->ram_bulk_stage ?
> +                              ram_bytes_total() : remaining_size);
>      }
>  }

Hi

I don't oppose the change.  But what I don't understand is _why_ it is
needed (or, to put it another way, how it worked until now).

I was wondering about the opposite direction: just initialize the number
of dirty pages at the beginning of the loop and then decrease it for each
processed page.  I don't remember either how big the speedup of not
walking the bitmap on the 1st stage was to start with.

Later, Juan.