On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shutdown the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU was to fork() the guest and shutdown one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea. Will the guest actually shut down nicely without a
network? Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests. So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function
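
For illustration only, the fork() part of the proposal can be sketched at
the level of an ordinary process. This is not QEMU code, and it does not
model block-device snapshots, ACPI events, or network disconnection. The
helper run_shutdown_in_fork(), the guest dict, and snapshot_path are all
hypothetical stand-ins: the dict plays the role of guest RAM, and the file
plays the role of the writable snapshot. The sketch only shows that the
child works on a copy-on-write copy of memory, sends its writes to the
snapshot, and is bounded by a timeout.

```python
import os
import signal
import time


def run_shutdown_in_fork(guest, snapshot_path, timeout=5.0):
    """Fork; the child "shuts down" its private copy of the guest state.

    Hypothetical helper, not a QEMU API. Returns the child's exit code,
    or None if the child was killed after exceeding the timeout.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy-on-write copy of the parent's, so
        # nothing changed here is visible to the parent.
        code = 1
        try:
            # Stand-in for an orderly guest shutdown: flush pending
            # writes to the snapshot, never to the live disk.
            with open(snapshot_path, "a") as disk:
                for block in guest["dirty"]:
                    disk.write(block + "\n")
            guest["dirty"].clear()
            code = 0
        finally:
            os._exit(code)  # leave without running the parent's cleanup

    # Parent: in the proposal it would resume the live guest at once.
    # Here it only polls, so that the timeout nit can be shown.
    deadline = time.monotonic() + timeout
    while True:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        if time.monotonic() >= deadline:
            os.kill(pid, signal.SIGKILL)  # child overran its timeout
            os.waitpid(pid, 0)            # reap it
            return None
        time.sleep(0.01)
```

Called with guest = {"dirty": ["pending-1", "pending-2"]}, the parent's
list is unchanged after the call, while the snapshot file has both entries
appended. Polling with WNOHANG is just one way to bound the child's
lifetime; a real implementation would more likely hook into its existing
event loop.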