On Wed, 08/21 22:49, g.danti@xxxxxxxxxx wrote:
> On 2013-08-21 21:40, Brian Jackson wrote:
> >On Wednesday, August 21, 2013 6:02:31 AM CDT, g.danti@xxxxxxxxxx
> >wrote:
> >>Hi all,
> >>I have a question about Linux KVM HA clusters.
> >>
> >>I understand that in an HA setup I can live migrate virtual
> >>machines between hosts that share the same storage (via various
> >>methods, e.g. DRBD). This enables us to migrate the VMs based on
> >>host load and performance.
> >>
> >>My current understanding is that, with this setup, a host
> >>crash will cause the VMs to be restarted on another host.
> >>
> >>However, I wonder if there is a method to have a fully
> >>fault-tolerant HA configuration, where by "fully
> >>fault-tolerant" I mean that a host crash (e.g. a power failure)
> >>will cause the VMs to be migrated to another host with no state
> >>change. In other words: is it possible to have an
> >>always-synchronized (both disk & memory) VM instance on another
> >>host, so that the migrated VM does not need to be restarted but
> >>only restored/unpaused? For disk data synchronization we can use
> >>shared storage (bypassing the problem) or something similar to
> >>DRBD, but what about memory?
> >
> >
> >You're looking for something that doesn't exist for KVM. There was a
> >project once for it called Kemari, but afaik, it's been abandoned for
> >a while.
>
> Hi Brian,
> thank you for your reply.
>
> As I googled extensively without finding anything, I was prepared for
> a similar response.
>
> Anyway, from what I understand, Qemu already uses a similar approach
> (tracking dirty memory pages) when live migrating virtual machines
> to another host.
>

Active/active does not sound easy to get, it seems to me, since you
would need to make sure the VMs on both nodes are in the same state all
the time. That sounds impossible for two emulator processes running on
two different hosts.

I think a hot spare is more practical: in the background you repeatedly
trigger migration of the memory delta and copy it to the hot spare, but
don't start running it. Once the active one fails, you can resume the
hot spare, which is at the latest checkpoint. But I think this needs
some work on the current live migration code.

Fam
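
The hot-spare scheme described above can be sketched as a toy model.
This is a minimal Python simulation, not QEMU code: the classes Primary
and HotSpare and the function checkpoint are hypothetical names invented
for illustration, and guest memory is modelled as a plain list of page
values.

```python
# Toy model of checkpoint-based hot-spare replication.
# Hypothetical names throughout; this does not use any QEMU/KVM API.

class Primary:
    """The active VM: tracks which pages were written since the last checkpoint."""

    def __init__(self, num_pages):
        self.pages = [0] * num_pages
        self.dirty = set()

    def write(self, page, value):
        self.pages[page] = value
        self.dirty.add(page)      # dirty-page tracking

    def take_delta(self):
        """Return the pages dirtied since the last call and reset tracking."""
        delta = {p: self.pages[p] for p in self.dirty}
        self.dirty.clear()
        return delta


class HotSpare:
    """The standby VM: receives deltas but stays paused until failover."""

    def __init__(self, num_pages):
        self.pages = [0] * num_pages
        self.running = False

    def apply_delta(self, delta):
        for page, value in delta.items():
            self.pages[page] = value

    def resume(self):
        self.running = True


def checkpoint(primary, spare):
    """Copy only the memory delta to the spare; return how many pages were sent."""
    delta = primary.take_delta()
    spare.apply_delta(delta)
    return len(delta)


primary = Primary(8)
spare = HotSpare(8)

primary.write(1, 11)
primary.write(3, 33)
sent_first = checkpoint(primary, spare)    # two dirty pages copied

primary.write(3, 34)
sent_second = checkpoint(primary, spare)   # only the re-dirtied page copied

primary.write(5, 55)                       # written after the last checkpoint
spare.resume()                             # primary "fails"; spare takes over
```

Writes made after the last checkpoint (page 5 here) are not on the spare
when it resumes. That is the cost of resuming from the latest checkpoint
instead of keeping two VMs in the same state all the time.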