On Fri, 2007-03-09 at 03:25 +0000, Daniel P. Berrange wrote:
> Thinking about later RPM upgrades I think we need to think about whether it
> will be possible to restart the libvirt_qemud while guests & networks are
> running.

If I had time, I'd give some serious thought as to whether we need to
allow this. Are there any other examples of a daemon that manages
something long-lived that can't be restarted without shutting down what
it's managing?

> There's a couple of issues:
>
>  - We do waitpid() to cleanup qemu & dnsmasq processes when we stop domains
>    & networks, or to detect when they crash. For the former, we could make
>    them daemons to avoid waitpid() cleanup, but we'd still need waitpid to
>    be able to detect shutdowns. There is also the issue of enumerating
>    running instances.
>
>  - We always try to re-create a bridge device at startup, even if it already
>    exists. Likewise we always try to add the IPtables rules & start dnsmasq.
>    We can easily detect if the bridge already exists. I think we can probably
>    double check iptables rules too. The tricky one is figuring out whether
>    a dnsmasq instance is still running.
>
> Dealing with these not only helps planned restarts, but will also make it
> possible to start up the daemon again after a crash without having to kill off
> all guests & networks manually. So I think it is worth investigating what
> we can do to enable restarts. It might be worth waiting until we sort out
> whether we'll merge libvirt_qemud with the generic libvirtd remote daemon
> though so we don't have to do the work twice over.

I guess the way I'd look at it is, a running qemud contains various
state - how do you recover that state on restart? e.g.

  - the list of running VMs, the PID of the qemu processes, the
    stdout/stderr/monitor pipes, the domain ID, and the domain UUID if
    we generated it

  - the list of running networks, the bridge associated with each
    network and the PID of the dnsmasq processes.
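
To make the state list above a bit more concrete, here is a rough sketch
of saving it to a file and recovering it after a restart. It is Python
for brevity, not qemud code; the function names, the JSON layout and the
field names ("pid", "dnsmasq_pid") are all made up for illustration. It
assumes that probing a PID with signal 0 is an acceptable liveness check,
since waitpid() only works on our own children and the old qemu/dnsmasq
processes are no longer children of a restarted daemon. It does not
address the stdout/stderr/monitor pipes at all, and PID reuse could
produce false positives.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the runtime state to `path` as JSON, replacing it atomically."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    # Rename over the old file so a crash never leaves a half-written file.
    os.replace(tmp, path)


def pid_alive(pid):
    """Return True if some process with this PID currently exists.

    Signal 0 performs the existence/permission check without delivering
    a signal. A recycled PID would still report True (known limitation).
    """
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def load_state(path):
    """Reload saved state, dropping entries whose process has gone away."""
    try:
        with open(path) as f:
            state = json.load(f)
    except FileNotFoundError:
        # First start, or nothing was running when we last saved.
        return {"domains": [], "networks": []}
    state["domains"] = [d for d in state["domains"] if pid_alive(d["pid"])]
    state["networks"] = [
        n for n in state["networks"] if pid_alive(n["dnsmasq_pid"])
    ]
    return state
```

The daemon would call save_state() whenever a domain or network starts
or stops, and load_state() once at startup, with the file living
somewhere under /var.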

I could perhaps imagine using named pipes, caching this state in /var
and re-loading it on startup but ... non-trivial to say the least.

Cheers,
Mark.