* Yang Hongyang (yanghy@xxxxxxxxxxxxxx) wrote:
> Virtual machine (VM) replication is a well known technique for
> providing application-agnostic software-implemented hardware fault
> tolerance "non-stop service". COLO is a high availability solution.
> Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
> receive the same request from client, and generate response in parallel
> too. If the response packets from PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea is presented in Xen summit 2012, and 2013,
> and academia paper in SOCC 2013. It's also presented in KVM forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to above document for detailed information.
> Please also refer to previous posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html

Hi Yang,
  Thanks for this set of patches (and I've replied to many individually).

> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.1
>
> This patchset is RFC, implements the frame of colo, without
> failover and nic/disk replication. But it is ready for demo
> the COLO idea above QEMU-Kvm.
> Steps using this patchset to get an overview of COLO:
> 1. configure the source with --enable-colo option
> 2. compile
> 3. just like QEMU's normal migration, run 2 QEMU VM:
>    - Primary VM
>    - Secondary VM with -incoming tcp:[IP]:[PORT] option
> 4. on Primary VM's QEMU monitor, run following command:
>    migrate_set_capability colo on
>    migrate tcp:[IP]:[PORT]
> 5. done
>    you will see two running VMs, whenever you make changes to PVM, SVM
>    will be synced to PVM's state.
>
> TODO list:
> 1. failover
> 2. nic replication
> 3. disk replication[COLO Disk manager]

I wonder if there are any parts that can be borrowed from other code
to get it going; I notice that the reverse execution patchset has
a network packet record/replay mode:

https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00157.html

What was used for the nic comparison in the 2013 KVM forum paper?

Dave

>
> Any comments/feedbacks are warmly welcomed.
>
> Thanks,
> Yang
>
> Yang Hongyang (17):
>   configure: add CONFIG_COLO to switch COLO support
>   COLO: introduce an api colo_supported() to indicate COLO support
>   COLO migration: add a migration capability 'colo'
>   COLO info: use colo info to tell migration target colo is enabled
>   COLO save: integrate COLO checkpointed save into qemu migration
>   COLO restore: integrate COLO checkpointed restore into qemu restore
>   COLO buffer: implement colo buffer as well as QEMUFileOps based on it
>   COLO: disable qdev hotplug
>   COLO ctl: implement API's that communicate with colo agent
>   COLO ctl: introduce is_slave() and is_master()
>   COLO ctl: implement colo checkpoint protocol
>   COLO ctl: add a RunState RUN_STATE_COLO
>   COLO ctl: implement colo save
>   COLO ctl: implement colo restore
>   COLO save: reuse migration bitmap under colo checkpoint
>   COLO ram cache: implement colo ram cache on slaver
>   HACK: trigger checkpoint every 500ms
>
>  Makefile.objs                      |   2 +
>  arch_init.c                        | 174 +++++++++-
>  configure                          |  14 +
>  include/exec/cpu-all.h             |   1 +
>  include/migration/migration-colo.h |  36 +++
>  include/migration/migration.h      |  13 +
>  include/qapi/qmp/qerror.h          |   3 +
>  migration-colo-comm.c              |  78 +++++
>  migration-colo.c                   | 643 +++++++++++++++++++++++++++++++++++++
>  migration.c                        |  45 ++-
>  qapi-schema.json                   |   9 +-
>  stubs/Makefile.objs                |   1 +
>  stubs/migration-colo.c             |  34 ++
>  vl.c                               |  12 +
>  14 files changed, 1044 insertions(+), 21 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 migration-colo-comm.c
>  create mode 100644 migration-colo.c
>  create mode 100644 stubs/migration-colo.c
>
> --
> 1.9.1
>
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
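
For readers following the nic comparison question, the rule stated in the quoted cover letter (release the response when the PVM and SVM packets are identical, otherwise take an on-demand checkpoint) can be sketched roughly as below. This is only an illustration under assumptions: `Packet`, `ColoAction`, `packets_identical()` and `colo_compare()` are hypothetical names that do not come from the patchset, and a plain byte-for-byte comparison is assumed, which may not be what the KVM forum implementation actually did.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not taken from the COLO patchset. */
typedef struct {
    const unsigned char *data;  /* raw response bytes */
    size_t len;                 /* number of bytes in data */
} Packet;

typedef enum {
    COLO_RELEASE,     /* outputs match: the response may go to the client */
    COLO_CHECKPOINT   /* outputs differ: sync the SVM to the PVM first */
} ColoAction;

/* Assumed comparison: byte-for-byte equality of the two responses. */
static bool packets_identical(const Packet *a, const Packet *b)
{
    if (a->len != b->len) {
        return false;
    }
    /* Avoid calling memcmp() on possibly NULL pointers for empty packets. */
    return a->len == 0 || memcmp(a->data, b->data, a->len) == 0;
}

/* Decide what to do with one pair of responses from the PVM and the SVM. */
static ColoAction colo_compare(const Packet *pvm, const Packet *svm)
{
    assert(pvm != NULL && svm != NULL);
    return packets_identical(pvm, svm) ? COLO_RELEASE : COLO_CHECKPOINT;
}
```

A caller would feed each pair of outgoing packets to `colo_compare()` and only forward the PVM's packet once the result is `COLO_RELEASE`; on `COLO_CHECKPOINT` it would hold the packet until the checkpoint completes.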