Another problem occurs when max_downtime is too short. This can result in a
never-ending migration task. To reproduce, just play a video inside a VM and
set max_downtime to 30ns.

Sure, one can argue that this behavior is expected. But the following would
avoid the problem:

+    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
+        return 1;
+    }

Or do you think that is not reasonable?

- Dietmar

> -----Original Message-----
> From: Glauber Costa [mailto:glommer@xxxxxxxxxx]
> Sent: Wednesday, 30 September 2009 06:49
> To: Dietmar Maurer
> Cc: Anthony Liguori; kvm
> Subject: Re: migrate_set_downtime bug
>
> On Tue, Sep 29, 2009 at 06:36:57PM +0200, Dietmar Maurer wrote:
> > > Also, if this is really the case (buffered), then the bandwidth
> > > capping part of migration is also wrong.
> > >
> > > Have you compared the reported bandwidth to your actual bandwidth? I
> > > suspect the source of the problem can be that we're currently
> > > ignoring the time we take to transfer the state of the devices, and
> > > maybe it is not negligible.
> >
> > I have a 1GB network (e1000 card), and get values like bwidth=0.98 -
> > which is much too high.
>
> The main reason for not using the whole migration time is that it can
> lead to values that are not very helpful in situations where the network
> load changes too much.
>
> Since the problem you pinpointed does exist, I would suggest measuring
> the average load of the last, say, 10 iterations. How would that work
> for you?
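
[Editor's illustration, not part of the original patch: a minimal C sketch of
the kind of sliding-window bandwidth measurement Glauber suggests, averaging
the last 10 iterations. All names here (bwidth_record, bwidth_average,
BWIDTH_WINDOW) are hypothetical and are not QEMU's actual migration code.]

    #include <stddef.h>
    #include <stdint.h>

    #define BWIDTH_WINDOW 10

    static double bwidth_window[BWIDTH_WINDOW]; /* bytes/sec per iteration */
    static size_t bwidth_count;                 /* samples collected so far */
    static size_t bwidth_next;                  /* next slot to overwrite   */

    /* Record the bandwidth of one iteration: bytes sent / elapsed seconds. */
    static void bwidth_record(uint64_t bytes_sent, double elapsed_s)
    {
        if (elapsed_s <= 0) {
            return; /* ignore degenerate measurements */
        }
        bwidth_window[bwidth_next] = bytes_sent / elapsed_s;
        bwidth_next = (bwidth_next + 1) % BWIDTH_WINDOW;
        if (bwidth_count < BWIDTH_WINDOW) {
            bwidth_count++;
        }
    }

    /* Average bandwidth over the last (up to) BWIDTH_WINDOW iterations,
     * which could then be used to estimate the expected downtime for the
     * remaining dirty RAM instead of a single noisy per-iteration value. */
    static double bwidth_average(void)
    {
        double sum = 0;
        for (size_t i = 0; i < bwidth_count; i++) {
            sum += bwidth_window[i];
        }
        return bwidth_count ? sum / bwidth_count : 0;
    }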
Attachment: migrate.diff