On 08/09/2012 09:20 AM, Daniel P. Berrange wrote: > From: "Daniel P. Berrange" <berrange@xxxxxxxxxx> > > The virtlockd daemon maintains file locks on behalf of libvirtd > and any VMs it is running. These file locks must be held for as > long as any VM is running. If virtlockd itself ever quits, then > it is expected that a node would be fenced/rebooted. Thus to > allow for software upgrads on live systemd, virtlockd needs the s/upgrads/upgrades/ s/systemd/systems/ ? > ability to re-exec() itself. > > Upon receipt of SIGUSR1, virtlockd will save its current live > state out to a file /var/run/virtlockd-restart-exec.json > It then re-exec()'s itself with exactly the same argv as it > originally had, and loads the state file, reconstructing any > objects as appropriate. > > The state file contains information about all locks held and > all network services and clients currently active. An example > state document is > > > Signed-off-by: Daniel P. Berrange <berrange@xxxxxxxxxx> > --- > libvirt.spec.in | 5 +- > src/locking/lock_daemon.c | 417 ++++++++++++++++++++++++++++++++++++++++++++-- > 2 files changed, 408 insertions(+), 14 deletions(-) > > diff --git a/libvirt.spec.in b/libvirt.spec.in > index 5a5d724..4dde964 100644 > --- a/libvirt.spec.in > +++ b/libvirt.spec.in > @@ -1436,7 +1436,10 @@ fi > /bin/systemctl daemon-reload >/dev/null 2>&1 || : > if [ $1 -ge 1 ] ; then > # Package upgrade, not uninstall > - /bin/systemctl try-restart virtlockd.service >/dev/null 2>&1 || : > + /bin/systemctl status virtlockd.service >/dev/null 2>&1 > + if [ $? = 1 ] ; then That says 'if virtlockd.service failed to report status, then send it SIGUSR1'. Don't you really want to check '$? = 0'? > +++ b/src/locking/lock_daemon.c > > +static virLockDaemonPtr > +virLockDaemonNewPostExecRestart(virJSONValuePtr object) > +{ > + virLockDaemonPtr lockd; > + virJSONValuePtr child; > + virJSONValuePtr lockspaces; > + size_t i; > + int n; > + > + if (VIR_ALLOC(lockd) < 0) { > + virReportOOMError(); > + return NULL; > + } > + > + if (virMutexInit(&lockd->lock) < 0) { > + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > + _("Unable to initialize mutex")); > + VIR_FREE(lockd); > + return NULL; > + } > + > + if (!(lockd->lockspaces = virHashCreate(3, > + virLockDaemonLockSpaceDataFree))) Rather than re-creating with the original default of 3, should we recreate with the number of elements read from JSON? (Of course, that means you can't call virHashCreate until later in the function, which may be too complex to wait for) > @@ -464,6 +569,8 @@ virLockDaemonSetupNetworkingSystemD(virNetServerPtr srv) > unsigned long long procid; > unsigned int nfds; > > + VIR_DEBUG("Setting up networking from systemd"); > + This hunk belongs in the previous patch. > + if (virJSONValueObjectGetNumberUint(object, "ownerPid", &ownerPid) < 0) { > + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > + _("Missing ownerPid data in JSON document")); > + goto error; > + } > + priv->ownerPid = (pid_t)ownerPid; This cast to (pid_t) might result in truncation. Should we be doing further sanity checking, such as ensuring the cast doesn't truncate and that the resulting pid is non-negative? > +static int > +virLockDaemonPreExecRestart(virNetServerPtr srv, > + char **argv) > +{ > + virJSONValuePtr child; > + char *state = NULL; > + int ret = -1; > + virJSONValuePtr object; > + char *magic; > + virHashKeyValuePairPtr pairs = NULL, tmp; > + virJSONValuePtr lockspaces; > + > + VIR_DEBUG("Running pre-restart exec"); Oh - now I see something that I wasn't catching when I complained about the earlier patches dealing with a fork/CLOEXEC/exec timing - you _aren't_ forking, but directly calling exec to overlay the current process with the new binary! Furthermore, although it looks like virtlockd calls fork to daemonize at startup, it never spawns any child processes - so we _don't_ have to worry about any O_CLOEXEC races on creation, or restoring CLOEXEC on restart, precisely because we never spawn off a child in another thread where the leak would be problematic. Still, I would feel a lot better if we document this clearly in the code (maybe 'this function is only safe to call from virtlockd' for all PreExec functions that clear CLOEXEC). > + VIR_DEBUG("Saving state %s", state); > + > + if (virFileWriteStr(VIR_LOCK_DAEMON_RESTART_EXEC_FILE, > + state, 0700) < 0) { Shouldn't 0600 be sufficient? We don't need to execute the .json file. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list