Re: [Qemu-devel] Memory sync algorithm during migration

Pierre Riteau <Pierre.Riteau@xxxxxxxx> · Tue, 22 Nov 2011 17:44:42 +0100

On 22 nov. 2011, at 14:04, Oliver Hookins wrote:

> On Tue, Nov 22, 2011 at 10:31:58AM +0100, ext Juan Quintela wrote:
>> Oliver Hookins <oliver.hookins@xxxxxxxxx> wrote:
>>> On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
>>>> Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx> wrote:
>>>>> Adding qemu-devel ML to CC.
>>>>> 
>>>>> Your question should have been sent to qemu-devel ML because the logic
>>>>> is implemented in QEMU, not KVM.
>>>>> 
>>>>> (2011/11/11 1:35), Oliver Hookins wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I am performing some benchmarks on KVM migration on two different types of VM.
>>>>>> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes about 20
>>>>>> seconds to migrate on our hardware while the 32GB VM takes about a minute.
>>>>>> 
>>>>>> With a reasonable amount of memory activity going on (in the hundreds of MB per
>>>>>> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
>>>>>> completes. Intuitively this tells me there is some watermarking of dirty pages
>>>>>> going on that is not particularly efficient when the dirty pages ratio is high
>>>>>> compared to total memory, but I may be completely incorrect.
>>>> 
>>>>> You can change the ratio IIRC.
>>>>> Hopefully, someone who knows well about QEMU will tell you better ways.
>>>>> 
>>>>> 	Takuya
>>>>> 
>>>>>> 
>>>>>> Could anybody fill me in on what might be going on here? We're using libvirt
>>>>>> 0.8.2 and kvm-83-224.el5.centos.1
>>>> 
>>>> This is pretty old qemu/kvm code base.
>>>> In principle, it makes no sense that with 32GB RAM migration finishes,
>>>> and with 4GB RAM it is unable (intuitively it should be, if ever, the
>>>> other way around).
>>>> 
>>>> Do you have an easy test that makes the problem easily reproducible?
>>>> Have you tried upstream qemu.git? (some improvements in that department).
>>> 
>>> I've just tried the qemu-kvm 0.14.1 tag which seems to be the latest that builds
>>> on my platform. For some strange reason migrations always seem to fail in one
>>> direction with "Unknown savevm section or instance 'hpet' 0" messages.
>> 
>> What is your platform?  This seems like you are running with hpet in one
>> side, but without it in the other.  What command line are you using?
> 
> Yes, my mistake. We were also testing later kernels and my test machines managed
> to get out of sync. One had support for hpet clocksource but the other one
> didn't.
> 
>> 
>>> This seems to point to different migration protocols on either end but they are
>>> both running the same version of qemu-kvm I built. Does this ring any bells for
>>> anyone?
>> 
>> Command line mismatch.  But, what is your platform?
> 
> CentOS5.6. Now running the VMs through qemu-kvm 0.14.1, unloaded migrations take
> about half the time but with memory I/O load now both VMs never complete the
> migration. In practical terms I'm writing about 50MB/s into memory and we have a
> 10Gbps network (and I've seen real speeds up to 8-9Gbps on the wire) so there
> should be enough capacity to sync up the dirty pages.
> 
> So now the 32GB and 4GB VMs have matching behaviour (which makes more sense) but
> I'm not any closer to figuring out what is going on.

You say you write 50 MB/s to memory, but that figure alone is not enough to analyze the problem.
How are these writes distributed in memory? Dirty memory is tracked and resent at page granularity, so if your writes are not confined to a small memory region, they can dirty many pages. In that case, live migration has to transfer far more than 50 MB/s of page data to the destination.
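
To give a rough idea of the amplification, here is a back-of-the-envelope sketch in Python. It is not how QEMU computes anything; the function name, the 4 KiB page size, and the 64-byte write size are assumptions chosen for illustration, and it models the worst case where every write lands on a different page.

```python
# Rough worst-case estimate, NOT QEMU's actual algorithm.
# Assumptions (hypothetical): 4 KiB pages, every write touches pages
# that no other write in the same second touches.

PAGE_SIZE = 4096            # bytes per page (assumed)
LINK_BYTES_PER_S = 10e9 / 8  # 10 Gbps link expressed in bytes/s


def worst_case_dirty_bytes_per_s(write_bytes_per_s, write_size,
                                 page_size=PAGE_SIZE):
    """Bytes of page data dirtied per second if no two writes share a page."""
    writes_per_s = write_bytes_per_s / write_size
    # Ceiling division: a write always dirties at least one whole page.
    pages_per_write = -(-write_size // page_size)
    return writes_per_s * pages_per_write * page_size


write_rate = 50e6  # the 50 MB/s mentioned in the thread

# Writes that each fill a whole page: no amplification.
sequential = worst_case_dirty_bytes_per_s(write_rate, write_size=4096)

# Scattered 64-byte writes: each one dirties a full 4 KiB page.
scattered = worst_case_dirty_bytes_per_s(write_rate, write_size=64)

print(f"page-sized writes: {sequential / 1e6:.0f} MB/s of dirty pages")
print(f"64-byte writes:    {scattered / 1e6:.0f} MB/s of dirty pages")
print(f"exceeds 10 Gbps link: {scattered > LINK_BYTES_PER_S}")
```

Under these assumptions, 50 MB/s of scattered 64-byte writes dirties 64 times more page data than the same 50 MB/s written a full page at a time, which is more than a 10 Gbps link can carry. This is an upper bound: writes that hit an already-dirty page add nothing, and the amount dirtied in one pass cannot exceed the size of guest RAM.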

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/
