On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it. The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed. Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out. So I guss we don't have much choice but to manually kill.Reported-by: Wang Yufei <james.wangyufei@xxxxxxxxxx> Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false; + /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example. Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu. What would be the platforms this wouldn't run on? I can only think of BSD where /proc doesn't have to be mounted. Can we somehow require that? Martin
+ virProcessKillPainfully(pid, true); + break; + } + + pos += strlen(cmdline + pos) + 1; + } + + VIR_FREE(cmdline); + } + VIR_DEBUG("Try to get caps via QMP qemuCaps=%p", qemuCaps);Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list