On 08/08/2011 06:24 AM, Isaku Yamahata wrote:
This mail is on "Yabusame: Postcopy Live Migration for Qemu/KVM"
on which we'll give a talk at KVM-forum.
The purpose of this mail is to let developers know about it in advance,
so that we can get better feedback on its design/implementation approach
before we start implementing it.
Interesting; what is the impact of increased latency on memory reads?
There are several design points.
- who takes care of pulling page contents.
an independent daemon vs a thread in qemu
The daemon approach is preferable, because an independent daemon makes
it easy to debug the postcopy memory mechanism without qemu.
If required, it wouldn't be difficult to convert the daemon into
a thread in qemu.
Isn't this equivalent to touching each page in sequence?
Care must be taken that we don't post too many requests, or it could
affect the latency of synchronous accesses by the guest.
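
One way to keep background prefetch from competing with demand faults
would be to bound the number of in-flight background pulls, e.g. with a
counting semaphore that only prefetch has to acquire. A rough sketch;
post_page_request() and the urgent flag are made-up names for
illustration, not a real API:

#include <semaphore.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BG_INFLIGHT 16     /* cap on outstanding background pulls */

/* Hypothetical transport call that asks the source for one page. */
extern void post_page_request(uint64_t gfn, bool urgent);

static sem_t bg_slots;

static void prefetch_init(void)
{
    sem_init(&bg_slots, 0, MAX_BG_INFLIGHT);
}

/* Demand fault from a blocked vcpu: sent immediately, never throttled. */
static void pull_on_fault(uint64_t gfn)
{
    post_page_request(gfn, true);
}

/* Background scan: blocks while MAX_BG_INFLIGHT pulls are outstanding. */
static void pull_in_background(uint64_t gfn)
{
    sem_wait(&bg_slots);
    post_page_request(gfn, false);
}

/* Completion handler frees a slot for the next background pull. */
static void page_arrived(uint64_t gfn, bool was_urgent)
{
    (void)gfn;
    if (!was_urgent)
        sem_post(&bg_slots);
}

That way demand faults never wait behind a long prefetch queue.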
- connection between the source and the destination
The connection for live migration can be re-used after sending machine
state.
- transfer protocol
The existing migration protocol can be extended.
- hooking guest RAM access
Introduce a character device to handle page faults.
When a page fault occurs, it queues a page request to the user space
daemon at the destination. The daemon then pulls the page contents from
the source and feeds them into the character device, which resolves the
page fault.
This doesn't play well with host swapping, transparent hugepages, or
ksm, does it?
I see you note this later on.
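
To make the flow concrete, the daemon's servicing loop could look
roughly like the sketch below. The read()/write() protocol on the
device, the request/reply layouts, and the pull_page() helper are all
assumptions for illustration, not the proposed driver interface:

#include <stdint.h>
#include <unistd.h>

#define PAGE_SIZE 4096

struct page_req {              /* assumed request record from the driver */
    uint64_t pgoff;            /* faulting page offset in guest RAM */
};

struct page_reply {            /* assumed reply record to the driver */
    uint64_t pgoff;
    uint8_t  data[PAGE_SIZE];
};

/* Assumed helper: fetch one page from the source over the reused
 * migration connection. */
extern int pull_page(int src_fd, uint64_t pgoff, uint8_t *buf);

static void serve_faults(int dev_fd, int src_fd)
{
    struct page_req req;
    struct page_reply rep;

    /* Each read() returns one queued page fault from the device. */
    while (read(dev_fd, &req, sizeof(req)) == (ssize_t)sizeof(req)) {
        rep.pgoff = req.pgoff;
        if (pull_page(src_fd, req.pgoff, rep.data) < 0)
            break;
        /* Writing the contents back resolves the fault and wakes the
         * blocked vcpu thread. */
        if (write(dev_fd, &rep, sizeof(rep)) != (ssize_t)sizeof(rep))
            break;
    }
}

Note that a demand fault blocks the faulting vcpu until the write()
lands, which is exactly why request throttling matters.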
* More on hooking guest RAM access
There are several candidates for the implementation. Our preference is
the character device approach.
- inserting hooks into everywhere in qemu/kvm
This is impractical.
- backing store for guest ram
A block device or a file can be used to back guest RAM, so that
accesses to guest RAM can be hooked.
pros
- new device driver isn't needed.
cons
- future improvement would be difficult
- some KVM host features (KSM, THP) wouldn't work
- character device
qemu mmap()s the dedicated character device, and the driver hooks the
resulting page faults.
pros
- straightforward approach
- future improvement would be easy
cons
- new driver is needed
- some KVM host features (KSM, THP) wouldn't work
They check whether a given VMA is anonymous; this can be fixed.
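
On the qemu side, backing guest RAM with such a device might look
roughly like this; "/dev/postcopy" and the ioctl-free setup are
invented here for illustration only:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static void *alloc_postcopy_ram(size_t ram_size)
{
    /* Hypothetical device node; the real driver may differ. */
    int fd = open("/dev/postcopy", O_RDWR);
    if (fd < 0) {
        perror("open /dev/postcopy");
        return NULL;
    }

    /* Faults on this mapping go to the driver, which queues them for
     * the user space daemon instead of allocating a fresh page. */
    void *ram = mmap(NULL, ram_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (ram == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return NULL;
    }
    return ram;
}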
- swap device
When creating the guest, it is set up as if all the guest RAM were
swapped out to a dedicated swap device, which may be an nbd disk (or
some kind of user space block device, BUSE?).
When the VM tries to access memory, a swap-in is triggered and I/O to
the swap device is issued. The I/O is then routed to the daemon in user
space via the nbd protocol (or BUSE, AoE, iSCSI...). The daemon pulls
pages from the migration source and services the I/O request.
pros
- After the page transfer is complete, everything is the same as in the normal case.
- no new device driver is needed
cons
- future improvement would be difficult
- administration: setting up nbd, swap device
Using a swap device would be my preference. We'd still be using
anonymous memory so thp/ksm/ordinary swap still work.
It would need to be a special kind of swap device since we only want to
swap in, and never out, to that device. We'd also need a special way of
telling the kernel that memory comes from that device. In that respect
it's similar to your second option.
Maybe we should use a backing file (using nbd) and have a madvise() call
that converts the vma to anonymous memory once the migration is finished.
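
Something like the following, where MADV_MAKE_ANON is a placeholder
name for a flag that doesn't exist today:

#include <stddef.h>
#include <sys/mman.h>

/* Placeholder value; no such madvise flag exists. */
#define MADV_MAKE_ANON 100

static int finish_postcopy(void *guest_ram, size_t ram_size)
{
    /* Once the last page has been pulled from the source, detach the
     * vma from its nbd-backed file and keep the contents as ordinary
     * anonymous memory, so thp/ksm/swap keep working. */
    return madvise(guest_ram, ram_size, MADV_MAKE_ANON);
}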
--
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html