On Tue, May 05, 2009 at 11:38:13PM -0500, Matthew Farrellee wrote:
> Daniel P. Berrange wrote:
> > On Tue, May 05, 2009 at 04:13:38PM -0400, Hugh O. Brock wrote:
> >> Not too long ago we took a patch that allowed QEMU VMs to keep running
> >> even if libvirtd died or was restarted.
> >>
> >> I was talking to Matt Farrellee (cc'd) this afternoon about
> >> manageability, and he feels fairly strongly that this behavior should be
> >> optional -- in other words, it should be possible to guarantee that if
> >> libvirtd dies, it will take all the VMs with the "die-with-libvirtd"
> >> flag set down with it.
> >>
> >> I'm not sure this API is portable to Xen, but it would work on any
> >> hypervisor that represents the VM as a normal process.
> >>
> >> Does this strike anyone else as useful behavior?
> >
> > This isn't really a model we want in the architecture. That the QEMU
> > instances used to die when libvirtd died was an unfortunate artifact
> > of the fact that QEMU was the parent process leader. These days all VMs
> > are fully daemonized, so there is no parent/child relationship. In fact
> > QEMU was really the odd-ball in this respect, because with Xen/OpenVZ/LXC
> > and VirtualBox, VMs have always happily continued when libvirtd stopped
> > or died, as do storage pools and virtual networks.
> >
> > This is important because it ensures we can automatically restart the
> > libvirtd daemon during RPM upgrades, and provides robustness should a
> > bug cause the daemon to crash - the daemon can be trivially restarted
> > and continue with no interruption to services being managed.
> >
>
> It doesn't appear to be the case that the libvirtd daemon can trivially
> restart and continue with no interruptions. Right now it loses track of VMs.

That is a bug then. If you can reproduce it, please file a BZ ticket so
we can track it down & fix it.

> In a scenario where VMs are not deployed and locked to specific physical
> nodes, it can be highly valuable to have ways to ensure a VM is no
> longer running when a layer of its management stops functioning.

IMHO this is a problem to be solved by clustering software. If the
clustering software detects a failure with the management service,
then it should power fence the entire node. Relying on management
service failure to kill the VMs will never be reliable enough.

Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|