Re: [PATCH v3 00/35] postcopy live migration

Hi Isaku,


Are you going to be at the KVM Forum? (I think you have a presentation
there.) It would be nice if we could meet to see whether we can sync
our efforts.

As you know, we have been developing an RDMA-based solution for
postcopy migration, and we demonstrated the initial proof of concept in
December 2012 (we published some findings at VHPC 2012 and are
working with Petter Svard from Umea on a journal paper with a more
detailed performance review).

While RDMA postcopy live migration is just a by-product of our
long-term effort (I will present the project in my talk at the KVM
Forum), we grabbed the opportunity to address problems we were facing
with live migration of enterprise workloads, namely how to migrate an
in-memory database such as HANA under load.

We quickly discovered that precopy (even with optimizations) didn't
work with such workloads. We also tried your code; however, the
performance was far from satisfactory with large VMs under load, due to
the heavy cost of transferring memory between user space and the kernel
multiple times (in fact, it often failed).

We then tested a pure RDMA solution we developed (we support hardware
and software RDMA), and it works fine with all the workloads we tested
(we migrated VMs with 20+ GB running SAP HANA under a workload similar
to TPC-H). We hope to test with bigger configurations soon (1/2+ TB
of memory).

However, the integration of our code with the QEMU code base is not as
advanced and polished as yours, and I would like to know whether you
would be interested in joining our effort or collaborating on merging
our solutions, or perhaps in letting us piggyback on your effort.

Would you be free to meet at any time next week (Tuesday to Friday)?

PS: We will be open sourcing our project by the end of November, and
postcopy is only a small part of the technology developed.


Regards
Benoit


On 30 October 2012 08:32, Isaku Yamahata <yamahata@xxxxxxxxxxxxx> wrote:
>
> This is the v3 patch series of postcopy migration.
>
> The trees are available at
> git://github.com/yamahata/qemu.git qemu-postcopy-oct-30-2012
> git://github.com/yamahata/linux-umem.git linux-umem-oct-29-2012
>
> Major changes v2 -> v3:
> - implemented pre+post optimization
> - auto detection of postcopy by incoming side
> - using threads on destination instead of fork
> - using blocking io instead of select + non-blocking io loop
> - less memory overhead
> - various improvements and code simplifications
> - kernel module name change umem -> uvmem to avoid name conflict.
>
> Patches organization:
> 1-2: trivial fixes
> 3-5: preparation for threading. cherry-picked from migration tree
> 6-18: refactoring existing code and preparation
> 19-25: implement postcopy live migration itself (essential part)
> 26-35: optimization/heuristic for postcopy
>
> Usage
> =====
> You need to load the uvmem character device on the host before starting
> migration.
> Postcopy can be used with the tcg and kvm accelerators. The implementation
> depends only on the Linux uvmem character device, but the driver-dependent
> code is split out into a separate file.
> I tested only the host page size == guest page size case, but the
> implementation allows the host page size != guest page size case.
>
> The following options are added with this patch series.
> - incoming part
>   use -incoming as usual. Postcopy is automatically detected.
>   example:
>   qemu -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
>
> - outgoing part
>   options for migrate command
>   migrate [-p [-n] [-m]] URI
>           [<precopy count> [<prefault forward> [<prefault backward>]]]
>
>   Newly added options/arguments
>   -p: indicate postcopy migration
>   -n: disable background transfer of pages. This is for
>       benchmarking/debugging
>   -m: move background transfer of postcopy mode
>   <precopy count>: The number of precopy RAM scans before postcopy.
>                    default 0 (0 means no precopy)
>   <prefault forward>: The number of forward pages which are sent along
>                       with the on-demand page
>   <prefault backward>: The number of backward pages which are sent along
>                        with the on-demand page
>
>   example:
>   migrate -p -n tcp:<dest ip address>:4444
>   migrate -p -n -m tcp:<dest ip address>:4444 42 42 0
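As a rough illustration of how the prefault arguments could translate into a range of pages to send, here is a minimal C sketch. It assumes the window is simply clamped to the bounds of the RAM block containing the faulted page; the names prefault_window and struct page_range are hypothetical and do not come from the patch series, whose actual logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: inclusive range of page indices to send. */
struct page_range {
    size_t first;
    size_t last;
};

/*
 * Hypothetical helper (an assumption, not code from the series):
 * given the index of the faulted page inside a RAM block of nr_pages
 * pages, return the range covering the faulted page plus up to
 * `backward` pages before it and up to `forward` pages after it,
 * clamped to the block bounds.
 */
static struct page_range prefault_window(size_t faulted, size_t nr_pages,
                                         size_t forward, size_t backward)
{
    struct page_range r;

    assert(nr_pages > 0 && faulted < nr_pages);

    /* Clamp at the start of the block without underflowing. */
    r.first = (faulted >= backward) ? faulted - backward : 0;

    /* Clamp at the end of the block without overflowing. */
    r.last = (nr_pages - 1 - faulted >= forward) ? faulted + forward
                                                 : nr_pages - 1;
    return r;
}
```

With <prefault forward> = 42 and <prefault backward> = 0, as in the second example command, a fault on page 10 of a 100-page block would yield pages 10 through 52 under these assumptions.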
>
>
> TODO
> ====
> - benchmark/evaluation
> - improvement/optimization
>   At the moment, what I'm aware of at least is
>   - pre+post case
>     On the destination side, reading the dirty bitmap would cause long
>     latency. Create a thread for that.
> - consider FUSE/CUSE possibility
>
> basic postcopy work flow
> ========================
>         qemu on the destination
>               |
>               V
>         open(/dev/uvmem)
>               |
>               V
>         UVMEM_INIT
>               |
>               V
>         Here we have two file descriptors to
>         umem device and shmem file
>               |
>               |                                  umem threads
>               |                                  on the destination
>               |
>               V    create pipe to communicate
>         create threads-------------------------------,
>               |                                      |
>               V                                   mmap(shmem file)
>         mmap(uvmem device) for guest RAM          close(shmem file)
>               |                                      |
>               |                                      |
>               V                                      |
>         wait for ready from daemon <----pipe-----send ready message
>               |                                      |
>               |                                 Here the daemon takes over
>         send ok------------pipe---------------> the owner of the socket
>               |                                 to the source
>               V                                      |
>         entering post copy stage                     |
>         start guest execution                        |
>               |                                      |
>               V                                      V
>         access guest RAM                          read() to get faulted pages
>               |                                      |
>               V                                      V
>         page fault ------------------------------>page offset is returned
>         block                                        |
>                                                      V
>                                                   pull page from the source
>                                                   write the page contents
>                                                   to the shmem.
>                                                      |
>                                                      V
>         unblock     <-----------------------------write() to tell served pages
>         the fault handler returns the page           |
>         page fault is resolved                       |
>               |                                      V
>               |                                   touch guest RAM pages
>               |                                      |
>               |                                      V
>               |                                   release the cached page
>               |                                   madvise(MADV_REMOVE)
>               |
>               |
>               |                                   pages can be sent
>               |                                   in the background
>               |                                      |
>               |                                      V
>               |                                   mark page as cached
>               |                                   Thus future page fault
>               |                                   is avoided.
>               |                                      |
>               |                                      V
>               |                                   touch guest RAM pages
>               |                                      |
>               |                                      V
>               |                                   release the cached page
>               |                                   madvise(MADV_REMOVE)
>               |                                      |
>               V                                      V
>
>                  all the pages are pulled from the source
>
>               |                                      |
>               V                                      V
>         migration completes                        exit()
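The fault-serving step of the umem thread in the diagram above can be sketched roughly as follows. This is a minimal in-memory stand-in, not the uvmem interface: the array of faulted offsets plays the role of what read() on the device returns, the served[] array plays the role of what would be passed to write(), and a plain buffer stands in for pages pulled from the source. struct umem_sketch, serve_faults and SKETCH_PAGE_SIZE are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for illustration */

/* Hypothetical stand-in for the destination-side state. */
struct umem_sketch {
    const uint8_t *source;  /* stands in for pages pulled from the source */
    uint8_t *shmem;         /* stands in for the mmap()ed shmem file */
    bool *cached;           /* one flag per page: already served? */
    size_t nr_pages;
};

/*
 * Serve one batch of faulted page offsets. Pages that are out of
 * range or already cached are skipped. Each newly served offset is
 * stored in served[], which must have room for nr_faulted entries.
 * Returns the number of offsets stored in served[].
 */
static size_t serve_faults(struct umem_sketch *u, const uint64_t *faulted,
                           size_t nr_faulted, uint64_t *served)
{
    size_t n = 0;

    assert(u != NULL && faulted != NULL && served != NULL);

    for (size_t i = 0; i < nr_faulted; i++) {
        uint64_t pgoff = faulted[i];

        if (pgoff >= u->nr_pages || u->cached[pgoff]) {
            continue;  /* nothing to do for this offset */
        }
        /* "pull page from the source, write the contents to the shmem" */
        memcpy(u->shmem + pgoff * SKETCH_PAGE_SIZE,
               u->source + pgoff * SKETCH_PAGE_SIZE,
               SKETCH_PAGE_SIZE);
        /* "mark page as cached", so a later fault is not served twice */
        u->cached[pgoff] = true;
        served[n++] = pgoff;
    }
    return n;
}
```

The sketch leaves out the parts that need the real device and socket: blocking in read(), notifying the kernel with write(), and releasing the cached page with madvise(MADV_REMOVE).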
>
>
> Isaku Yamahata (32):
>   migration.c: remove redundant line in migrate_init()
>   arch_init: DPRINTF format error and typo
>   osdep: add qemu_read_full() to read interrupt-safely
>   savevm: export qemu_peek_buffer, qemu_peek_byte, qemu_file_skip,
>     qemu_fflush
>   savevm/QEMUFile: consolidate QEMUFile functions a bit
>   savevm/QEMUFile: introduce qemu_fopen_fd
>   savevm/QEMUFile: add read/write QEMUFile on memory buffer
>   savevm, buffered_file: introduce method to drain buffer of buffered
>     file
>   arch_init: export RAM_SAVE_xxx flags for postcopy
>   arch_init/ram_save: introduce constant for ram save version = 4
>   arch_init: refactor ram_save_block() and export ram_save_block()
>   arch_init/ram_save_setup: factor out bitmap alloc/free
>   arch_init/ram_load: refactor ram_load
>   arch_init: factor out logic to find ram block with id string
>   migration: export migrate_fd_completed() and migrate_fd_cleanup()
>   uvmem.h: import Linux uvmem.h and teach update-linux-headers.sh
>   osdep: add QEMU_MADV_REMOVE and tirivial fix
>   postcopy: introduce helper functions for postcopy
>   savevm: add new section that is used by postcopy
>   postcopy: implement incoming part of postcopy live migration
>   postcopy outgoing: add -p option to migrate command
>   postcopy: implement outgoing part of postcopy live migration
>   postcopy/outgoing: add -n options to disable background transfer
>   postcopy/outgoing: implement forward/backword prefault
>   arch_init: factor out setting last_block, last_offset
>   postcopy/outgoing: add movebg mode(-m) to migration command
>   arch_init: factor out ram_load
>   arch_init: export ram_save_iterate()
>   postcopy: pre+post optimization incoming side
>   arch_init: export migration_bitmap_sync and helper method to get
>     bitmap
>   postcopy/outgoing: introduce precopy_count parameter
>   postcopy: pre+post optimization outgoing side
>
> Paolo Bonzini (1):
>   split MRU ram list
>
> Umesh Deshpande (2):
>   add a version number to ram_list
>   protect the ramlist with a separate mutex
>
>  Makefile.target                 |    2 +
>  arch_init.c                     |  391 +++++---
>  arch_init.h                     |   24 +
>  buffered_file.c                 |   59 +-
>  buffered_file.h                 |    1 +
>  cpu-all.h                       |   16 +-
>  exec.c                          |   62 +-
>  hmp-commands.hx                 |   21 +-
>  hmp.c                           |   12 +-
>  linux-headers/linux/uvmem.h     |   41 +
>  migration-exec.c                |    8 +-
>  migration-fd.c                  |   23 +-
>  migration-postcopy.c            | 2019 +++++++++++++++++++++++++++++++++++++++
>  migration-tcp.c                 |   16 +-
>  migration-unix.c                |   36 +-
>  migration.c                     |   65 +-
>  migration.h                     |   42 +
>  osdep.c                         |   24 +
>  osdep.h                         |   13 +-
>  qapi-schema.json                |    6 +-
>  qemu-common.h                   |    2 +
>  qemu-file.h                     |   12 +-
>  qmp-commands.hx                 |    4 +-
>  savevm.c                        |  223 ++++-
>  scripts/update-linux-headers.sh |    2 +-
>  sysemu.h                        |    2 +-
>  umem.c                          |  291 ++++++
>  umem.h                          |   88 ++
>  vl.c                            |    5 +-
>  29 files changed, 3265 insertions(+), 245 deletions(-)
>  create mode 100644 linux-headers/linux/uvmem.h
>  create mode 100644 migration-postcopy.c
>  create mode 100644 umem.c
>  create mode 100644 umem.h
>
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
" The production of too many useful things results in too many useless
people"

