Another problem occurs when max_downtime is too short. This can result in a
never-ending migration task. To reproduce, just play a video inside a VM and
set max_downtime to 30ns.

Sure, one can argue that this behavior is expected. But the following would
avoid the problem:

+    if ((stage == 2) && (bytes_transferred > 2*ram_bytes_total())) {
+        return 1;
+    }

Or do you think that is not reasonable?

- Dietmar

> -----Original Message-----
> From: Glauber Costa [mailto:glommer@xxxxxxxxxx]
> Sent: Wednesday, 30 September 2009 06:49
> To: Dietmar Maurer
> Cc: Anthony Liguori; kvm
> Subject: Re: migrate_set_downtime bug
>
> On Tue, Sep 29, 2009 at 06:36:57PM +0200, Dietmar Maurer wrote:
> > > Also, if this is really the case (buffered), then the bandwidth
> > > capping part of migration is also wrong.
> > >
> > > Have you compared the reported bandwidth to your actual bandwidth? I
> > > suspect the source of the problem can be that we're currently
> > > ignoring the time we take to transfer the state of the devices, and
> > > maybe it is not negligible.
> >
> > I have a 1GB network (e1000 card), and get values like bwidth=0.98 -
> > which is much too high.
>
> The main reason for not using the whole migration time is that it can
> lead to values that are not very helpful in situations where the network
> load changes too much.
>
> Since the problem you pinpointed does exist, I would suggest measuring
> the average load of the last, say, 10 iterations. How would that work
> for you?
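
[Editor's illustration, not part of the original patch: a minimal C sketch of
the kind of sliding-window bandwidth measurement Glauber suggests, averaging
the last 10 iterations. All names here (bwidth_record, bwidth_average,
BWIDTH_WINDOW) are hypothetical and are not QEMU's actual migration code.]

    #include <stddef.h>
    #include <stdint.h>

    #define BWIDTH_WINDOW 10

    static double bwidth_window[BWIDTH_WINDOW]; /* bytes/sec per iteration */
    static size_t bwidth_count;                 /* samples collected so far */
    static size_t bwidth_next;                  /* next slot to overwrite   */

    /* Record the bandwidth of one iteration: bytes sent / elapsed seconds. */
    static void bwidth_record(uint64_t bytes_sent, double elapsed_s)
    {
        if (elapsed_s <= 0) {
            return; /* ignore degenerate measurements */
        }
        bwidth_window[bwidth_next] = bytes_sent / elapsed_s;
        bwidth_next = (bwidth_next + 1) % BWIDTH_WINDOW;
        if (bwidth_count < BWIDTH_WINDOW) {
            bwidth_count++;
        }
    }

    /* Average bandwidth over the last (up to) BWIDTH_WINDOW iterations,
     * which could then be used to estimate the expected downtime for the
     * remaining dirty RAM instead of a single noisy per-iteration value. */
    static double bwidth_average(void)
    {
        double sum = 0;
        for (size_t i = 0; i < bwidth_count; i++) {
            sum += bwidth_window[i];
        }
        return bwidth_count ? sum / bwidth_count : 0;
    }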
Attachment: migrate.diff