In doPeer2PeerMigrate3(), the "finish" step checks whether domainMigrateFinish3() returns NULL. If it (ddomain) is NULL, the code simply restarts the guest on the source.

Please consider the scenario where the guest is already running on the dest, but the dest fails to report this to the source, so ddomain ends up NULL. If we then restart the guest on the source, the same guest will be running on both sides, and a SPLIT-BRAIN occurs. It seems much better to stop them both rather than leave them both running. At the very least, when we find that ddomain is NULL, we should probably check whether the problem was caused by a keepAlive failure and, if so, kill the guest on the source rather than restarting it. What do you think?

BTW, the comment that came with commit 2593f9692df0f128b14cde811e18aa49c1cf3e06 says: "The lock manager plugins should take care of safety in this scenario". I don't quite understand that:

1) If we migrate the guest with the flag VIR_MIGRATE_NON_SHARED_DISK, then the NBD server may take care of data consistency. But the NBD server has already been stopped before the cpus are started on the dest. So, at that moment, nothing takes care of this problem.

2) If we migrate the guest with a shared disk, does that mean NFS or other disk-sharing schemes should prevent split-brain by themselves?
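To make the proposal for the "finish" step concrete, here is a rough sketch of the decision I have in mind. It is not the actual libvirt code: SourceAction, chooseSourceAction() and both parameters are hypothetical names made up for illustration. I am assuming destConnAlive could be derived from something like virConnectIsAlive() on the dest connection, and that a NULL ddomain reported over a live connection means the dest really did fail.

```c
#include <stdbool.h>

/* Hypothetical: what the source does after the "finish" step. */
typedef enum {
    SRC_CONFIRM_AND_STOP, /* dest returned a domain: migration succeeded */
    SRC_RESUME,           /* dest reported failure over a live connection */
    SRC_KILL              /* dest state unknown: do not resume the guest */
} SourceAction;

/*
 * haveDdomain:   true if the finish call returned a non-NULL ddomain.
 * destConnAlive: true if the connection to the dest is still usable
 *                (assumption: e.g. no keepAlive timeout was detected).
 */
SourceAction
chooseSourceAction(bool haveDdomain, bool destConnAlive)
{
    if (haveDdomain)
        return SRC_CONFIRM_AND_STOP;

    /* ddomain is NULL. If we could still talk to the dest, the failure
     * was reported by the dest itself, so resuming on the source
     * matches what the code does today. */
    if (destConnAlive)
        return SRC_RESUME;

    /* ddomain is NULL and the connection is gone: the guest may already
     * be running on the dest, so resuming here risks split-brain. */
    return SRC_KILL;
}
```

With this, only the (NULL ddomain, dead connection) case changes behaviour; the other cases stay as they are now.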