Hi Cao,

2016-04-27 3:21 GMT+08:00 Cao, Lei <Lei.Cao@xxxxxxxxxxx>:
> This patch series adds memory tracking support for performant
> checkpoint/rollback implementations. It can also be used by live
> migration to improve predictability.
>
> Introduction
>
> Brendan Cully's Remus project white paper is one of the best written on
> the subject of fault tolerance using checkpoint/rollback techniques and
> is the best place to start for a general background.
> (http://www.cs.ubc.ca/~andy/papers/remus-nsdi-final.pdf)
> It gives a great outline of the basic requirements and characteristics
> of a checkpointed system, including a few of the performance issues.
> But Remus did not go far enough in the area of system performance for
> commercial production.
>
> This patch series addresses known bottlenecks and limitations in a
> checkpointed system: the use of large bitmaps to track dirty memory, and
> the lack of multi-thread support because mmu_lock is a spin lock. These
> modifications, along with further modifications to qemu, have allowed us
> to run checkpoint cycles at rates up to 2500 per second, while still
> allowing the VM to get useful work done.
>
> The patch series also helps to improve the predictability of live
> migrations of memory-write-intensive workloads. The qemu autoconverge
> feature helps such workloads by throttling CPUs to slow down memory
> writes. However, CPU throttling has an unknown effect on the guest, and
> it is ineffective for workloads where memory write speed is not dependent
> on CPU execution speed. A checkpointing mode, where the VM is paused and
> dirty memory is harvested periodically, will help in that regard. We have
> implemented a checkpointing-mode live migration, which we will put on
> github in the near future.
>
> Design Goals
>
> The patch series does not change or remove any existing KVM functionality.
> It represents only additional functions (ioctls) into KVM from user space,
> and these changes coexist with the current dirty memory logging facilities.
> It is possible to run multiple QEMU instances such that some of the QEMUs
> perform live migration using the existing memory logging mechanism while
> others migrate or run in fault-tolerant mode using the new memory tracking
> functions.
>
> Dynamic memory allocation and freeing are avoided during the checkpoint
> cycles in order to avoid surprises during performance-critical operations.
> Allocations and frees are done only when a VM enters or exits checkpoint
> mode. Once checkpoint mode is entered, a VM will typically run in this mode
> forever, where forever means until a fault occurs that leads to failover to
> the standby host, the VM is shut down, or a system administrator no longer
> wants to run in FT mode.
>
> Modifications
>
> All modifications affect only the KVM instance where the primary (active)
> VM is running. These modifications are not in play on the standby (passive)
> host, where a VM is created that matches the primary in its configuration
> but does not execute until a migration/failover event occurs.

I just saw your presentation slides from KVM Forum. You mentioned that
"CPU throttling may not be effective for some workloads where memory write
speed is not dependent on CPU execution speed". Could you point out which
kinds of workloads have a memory write speed that is not dependent on CPU
execution speed? Is the memory in such workloads mainly dirtied by DMA or
by something else?

Regards,
Wanpeng Li
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html