On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
> Hello Isaku Yamahata,

Hi.

> I just saw your patches..Would it be possible to email me a tar bundle of these
> patches (makes it easier to apply the patches to a copy of the upstream qemu.git)

I uploaded them to github for those who are interested in them.

git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git linux-umem-june-04-2012

> BTW, I am also curious if you have considered using any kind of RDMA features for
> optimizing the page-faults during postcopy ?

Yes, RDMA is an interesting topic. Can you share your use case/concerns/issues?
Then we can collaborate.
You may want to see Benoit's results. As far as I know, he has not published
his code yet.

thanks,

> Thanks
> Vinod
>
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 Jun 2012 18:57:02 +0900
> From: Isaku Yamahata<yamahata@xxxxxxxxxxxxx>
> To: qemu-devel@xxxxxxxxxx, kvm@xxxxxxxxxxxxxxx
> Cc: benoit.hudzia@xxxxxxxxx, aarcange@xxxxxxxxxx, aliguori@xxxxxxxxxx,
>     quintela@xxxxxxxxxx, stefanha@xxxxxxxxx, t.hirofuchi@xxxxxxxxxx,
>     dlaor@xxxxxxxxxx, satoshi.itoh@xxxxxxxxxx, mdroth@xxxxxxxxxxxxxxxxxx,
>     yoshikawa.takuya@xxxxxxxxxxxxx, owasserm@xxxxxxxxxx, avi@xxxxxxxxxx,
>     pbonzini@xxxxxxxxxx
> Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
> Message-ID:<cover.1338802190.git.yamahata@xxxxxxxxxxxxx>
>
> After a long time, we have v2. This is the qemu part.
> The linux kernel part is sent separately.
>
> Changes v1 -> v2:
> - split up patches for review
> - buffered file refactored
> - many bug fixes
>   Especially PV drivers can work with postcopy
> - optimization/heuristic
>
> Patches
> 1 - 30: refactoring existing code and preparation
> 31 - 37: implement postcopy itself (essential part)
> 38 - 41: some optimization/heuristic for postcopy
>
> Intro
> =====
> This patch series implements postcopy live migration.[1]
> As discussed at KVM forum 2011, a dedicated character device is used for
> distributed shared memory between migration source and destination.
> Now we can discuss/benchmark/compare with precopy. I believe there is
> much room for improvement.
>
> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>
>
> Usage
> =====
> You need to load the umem character device on the host before starting
> migration.
> Postcopy can be used with the tcg and kvm accelerators. The implementation
> depends only on the linux umem character device, but the driver-dependent
> code is split into a separate file.
> I tested only the host page size == guest page size case, but the
> implementation allows the host page size != guest page size case.
>
> The following options are added with this patch series.
> - incoming part
>   command line options
>   -postcopy [-postcopy-flags <flags>]
>   where flags is for changing behavior for benchmark/debugging
>   Currently the following flags are available
>   0: default
>   1: enable touching page request
>
>   example:
>   qemu -postcopy -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
>
> - outgoing part
>   options for migrate command
>   migrate [-p [-n] [-m]] URI [<prefault forward> [<prefault backward>]]
>   -p: indicate postcopy migration
>   -n: disable background transferring pages: This is for benchmark/debugging
>   -m: move background transfer of postcopy mode
>   <prefault forward>: The number of forward pages which are sent with
>                       on-demand
>   <prefault backward>: The number of backward pages which are sent with
>                        on-demand
>
>   example:
>   migrate -p -n tcp:<dest ip address>:4444
>   migrate -p -n -m tcp:<dest ip address>:4444 32 0
>
>
> TODO
> ====
> - benchmark/evaluation. Especially how async page fault affects the result.
> - improve/optimization
>   At the moment at least what I'm aware of is
>   - making incoming socket non-blocking with thread
>     As page compression is coming, it is impractical to do a non-blocking
>     read and check if the necessary data is read.
>   - touching pages in incoming qemu process by fd handler seems suboptimal.
>     creating dedicated thread?
>   - outgoing handler seems suboptimal causing latency.
> - consider FUSE/CUSE possibility
> - don't fork umemd, but create thread?
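
As a rough illustration of the <prefault forward> and <prefault backward>
arguments described under Usage, the sketch below computes which pages might
accompany one on-demand fault. It is not taken from the patch series: the
function name prefault_window, the ordering (faulted page first, then forward,
then backward) and the clamping to the RAM block bounds are all assumptions
made for the example, and it ignores pages that were already sent.

```python
def prefault_window(fault_page, forward, backward, nr_pages):
    """Return the page indices to send for one on-demand fault.

    Hypothetical helper, not from the series. Assumed policy: the faulted
    page goes first, then up to `forward` following pages, then up to
    `backward` preceding pages, all clamped to [0, nr_pages).
    """
    if not 0 <= fault_page < nr_pages:
        raise ValueError("fault_page is outside the RAM block")

    pages = [fault_page]

    # Forward neighbours, stopping at the end of the block.
    last = min(fault_page + forward, nr_pages - 1)
    pages.extend(range(fault_page + 1, last + 1))

    # Backward neighbours, nearest first, stopping at page 0.
    first = max(fault_page - backward, 0)
    pages.extend(range(fault_page - 1, first - 1, -1))

    return pages
```

With the values from the second migrate example (32 forward, 0 backward), a
fault on page 100 of a 1024-page block would yield pages 100 through 132
under these assumptions.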
>
> basic postcopy work flow
> ========================
>         qemu on the destination
>               |
>               V
>         open(/dev/umem)
>               |
>               V
>         UMEM_INIT
>               |
>               V
>         Here we have two file descriptors to
>         umem device and shmem file
>               |
>               |                                  umemd
>               |                                  daemon on the destination
>               |
>               V    create pipe to communicate
>         fork()---------------------------------------,
>               |                                      |
>               V                                      |
>         close(socket)                                V
>         close(shmem)                              mmap(shmem file)
>               |                                      |
>               V                                      V
>         mmap(umem device) for guest RAM           close(shmem file)
>               |                                      |
>         close(umem device)                           |
>               |                                      |
>               V                                      |
>         wait for ready from daemon<----pipe-----send ready message
>               |                                      |
>               |                                 Here the daemon takes over
>         send ok------------pipe---------------> the owner of the socket
>               |                                 to the source
>               V                                      |
>         entering post copy stage                     |
>         start guest execution                        |
>               |                                      |
>               V                                      V
>         access guest RAM                          read() to get faulted pages
>               |                                      |
>               V                                      V
>         page fault ------------------------------>page offset is returned
>         block                                        |
>                                                      V
>                                                   pull page from the source
>                                                   write the page contents
>                                                   to the shmem.
>                                                      |
>                                                      V
>         unblock<-----------------------------write() to tell served pages
>         the fault handler returns the page
>         page fault is resolved
>               |
>               |                                   pages can be sent
>               |                                   in the background
>               |                                      |
>               |                                      V
>               |                                   write()
>               |                                      |
>               V                                      V
>         The specified pages<-----pipe------------request to touch pages
>         are made present by                          |
>         touching guest RAM.                          |
>               |                                      |
>               V                                      V
>              reply-------------pipe-------------> release the cached page
>               |                                   madvise(MADV_REMOVE)
>               |                                      |
>               V                                      V
>
>                  all the pages are pulled from the source
>
>               |                                      |
>               V                                      V
>         the vma becomes anonymous<----------------UMEM_MAKE_VMA_ANONYMOUS
>        (note: I'm not sure if this can be implemented or not)
>               |                                      |
>               V                                      V
>         migration completes                        exit()
>
>
>
>
> Isaku Yamahata (41):
>   arch_init: export sort_ram_list() and ram_save_block()
>   arch_init: export RAM_SAVE_xxx flags for postcopy
>   arch_init/ram_save: introduce constant for ram save version = 4
>   arch_init: refactor host_from_stream_offset()
>   arch_init/ram_save_live: factor out RAM_SAVE_FLAG_MEM_SIZE case
>   arch_init: refactor ram_save_block()
>   arch_init/ram_save_live: factor out ram_save_limit
>   arch_init/ram_load: refactor ram_load
>   arch_init: introduce helper function to find ram block with id string
>   arch_init: simplify a bit by ram_find_block()
>   arch_init: factor out counting transferred bytes
>   arch_init: factor out setting last_block, last_offset
>   exec.c: factor out qemu_get_ram_ptr()
>   exec.c: export last_ram_offset()
>   savevm: export qemu_peek_buffer, qemu_peek_byte, qemu_file_skip
>   savevm: qemu_pending_size() to return pending buffered size
>   savevm, buffered_file: introduce method to drain buffer of buffered file
>   QEMUFile: add qemu_file_fd() for later use
>   savevm/QEMUFile: drop qemu_stdio_fd
>   savevm/QEMUFileSocket: drop duplicated member fd
>   savevm: rename QEMUFileSocket to QEMUFileFD, socket_close to fd_close
>   savevm/QEMUFile: introduce qemu_fopen_fd
>   migration.c: remove redundant line in migrate_init()
>   migration: export migrate_fd_completed() and migrate_fd_cleanup()
>   migration: factor out parameters into MigrationParams
>   buffered_file: factor out buffer management logic
>   buffered_file: Introduce QEMUFileNonblock for nonblock write
>   buffered_file: add qemu_file to read/write to buffer in memory
>   umem.h: import Linux umem.h
>   update-linux-headers.sh: teach umem.h to
>     update-linux-headers.sh
>   configure: add CONFIG_POSTCOPY option
>   savevm: add new section that is used by postcopy
>   postcopy: introduce -postcopy and -postcopy-flags option
>   postcopy outgoing: add -p and -n option to migrate command
>   postcopy: introduce helper functions for postcopy
>   postcopy: implement incoming part of postcopy live migration
>   postcopy: implement outgoing part of postcopy live migration
>   postcopy/outgoing: add forward, backward option to specify the size of prefault
>   postcopy/outgoing: implement prefault
>   migrate: add -m (movebg) option to migrate command
>   migration/postcopy: add movebg mode
>
>  Makefile.target                 |    5 +
>  arch_init.c                     |  298 ++++---
>  arch_init.h                     |   20 +
>  block-migration.c               |    8 +-
>  buffered_file.c                 |  322 ++++++--
>  buffered_file.h                 |   32 +
>  configure                       |   12 +
>  cpu-all.h                       |    9 +
>  exec-obsolete.h                 |    1 +
>  exec.c                          |   87 ++-
>  hmp-commands.hx                 |   18 +-
>  hmp.c                           |   10 +-
>  linux-headers/linux/umem.h      |   42 +
>  migration-exec.c                |   12 +-
>  migration-fd.c                  |   25 +-
>  migration-postcopy-stub.c       |   77 ++
>  migration-postcopy.c            | 1771 +++++++++++++++++++++++++++++++++++++++
>  migration-tcp.c                 |   25 +-
>  migration-unix.c                |   26 +-
>  migration.c                     |   97 ++-
>  migration.h                     |   47 +-
>  qapi-schema.json                |    4 +-
>  qemu-common.h                   |    2 +
>  qemu-file.h                     |    8 +-
>  qemu-options.hx                 |   25 +
>  qmp-commands.hx                 |    4 +-
>  savevm.c                        |  177 ++++-
>  scripts/update-linux-headers.sh |    2 +-
>  sysemu.h                        |    4 +-
>  umem.c                          |  364 ++++++++
>  umem.h                          |  101 +++
>  vl.c                            |   16 +-
>  vmstate.h                       |    2 +-
>  33 files changed, 3373 insertions(+), 280 deletions(-)
>  create mode 100644 linux-headers/linux/umem.h
>  create mode 100644 migration-postcopy-stub.c
>  create mode 100644 migration-postcopy.c
>  create mode 100644 umem.c
>  create mode 100644 umem.h
>
>
>
>
> ------------------------------
>

-- 
yamahata
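
For readers who want to experiment with the "ready"/"ok" exchange from the
work flow diagram in the quoted cover letter, here is a minimal sketch of that
step alone, written in Python rather than the C of the series. It only models
the fork() plus two pipes; the umem device, the shmem file and the socket to
the source are left out. The function name, the message strings and the
16-byte read size are assumptions for illustration, not the series' actual
protocol.

```python
import os


def ready_ok_handshake():
    """Fork a child that plays umemd and exchange ready/ok over two pipes.

    Hypothetical stand-in for the handshake in the work flow diagram.
    Returns (message seen by the parent, child's exit status).
    """
    daemon_to_qemu_r, daemon_to_qemu_w = os.pipe()
    qemu_to_daemon_r, qemu_to_daemon_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: stands in for the umemd daemon on the destination.
        os.close(daemon_to_qemu_r)
        os.close(qemu_to_daemon_w)
        # The real daemon would mmap the shmem file before this point.
        os.write(daemon_to_qemu_w, b"ready")
        reply = os.read(qemu_to_daemon_r, 16)
        # Exit 0 only if "ok" arrived; the real daemon would now take over
        # the socket to the source and start serving faulted pages.
        os._exit(0 if reply == b"ok" else 1)

    # Parent: stands in for qemu on the destination.
    os.close(daemon_to_qemu_w)
    os.close(qemu_to_daemon_r)
    message = os.read(daemon_to_qemu_r, 16)  # wait for ready from daemon
    os.write(qemu_to_daemon_w, b"ok")        # send ok
    _, status = os.waitpid(pid, 0)
    os.close(daemon_to_qemu_r)
    os.close(qemu_to_daemon_w)
    return message, os.WEXITSTATUS(status)
```

Using one pipe per direction keeps each end a plain blocking read or write,
which matches the two arrows labelled "pipe" in the diagram; whether the
series uses one pipe pair or some other arrangement is not stated in the
cover letter.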