On Mon, Apr 04, 2011 at 01:31:40PM -0600, Eric Blake wrote: > On 04/04/2011 06:55 AM, Daniel P. Berrange wrote: > > Allow the parent process to perform a bi-directional handshake > > with the child process during fork/exec. The child process > > will fork and do its initial setup. Immediately prior to the > > exec(), it will stop & wait for a handshake from the parent > > process. The parent process will spawn the child and wait > > until the child reaches the handshake point. It will do > > whatever extra setup work is required, before signalling the > > child to continue. > > > > The implementation of this is done using two pairs of blocking > > pipes. The first pair is used to block the parent, until the > > child writes a single byte. Then the second pair pair is used > > to block the child, until the parent confirms with another > > single byte. > > How worried are we about the child not doing any async-unsafe actions if > it wishes to avoid deadlock? We've already previously identified this > as a bug (such as in our handling of malloc in the child), but it's > somewhat of a corner-case, and I'm not sure how invasive it will be to > fix properly; so I am okay if a fixed version of this patch goes in > while we still leave that larger issue open for thought. IMHO, this patch doesn't make the existing problem any worse. eg, if VIR_DEBUG was going to cause deadlock, it would have happened before we get as far this new code I'm adding. > > +++ b/src/util/command.c > > @@ -35,6 +35,11 @@ > > #include "files.h" > > #include "buf.h" > > > > +#include <stdlib.h> > > +#include <stdbool.h> > > "internal.h" guaranteed this one for us. Opps, yes. > > +#include <poll.h> > > +#include <sys/wait.h> > > Why do we need this one? We're already using WIFEXITED and friends > without explicitly needing this header. I'll double check whether we need this. > > > @@ -1115,7 +1129,6 @@ virCommandRun(virCommandPtr cmd, int *exitstatus) > > return ret; > > } > > > > - > > /* > > * Perform all virCommand-specific actions, along with the user hook. > > */ > > Spurious whitespace change? > > > @@ -1125,12 +1138,61 @@ virCommandHook(void *data) > > virCommandPtr cmd = data; > > int res = 0; > > > > - if (cmd->hook) > > + if (cmd->hook) { > > + VIR_DEBUG("Run hook %p %p", cmd->hook, cmd->opaque); > > res = cmd->hook(cmd->opaque); > > + VIR_DEBUG("Done hook %d", res); > > This adds yet more calls to malloc() inside the child > > > + } > > if (res == 0 && cmd->pwd) { > > VIR_DEBUG("Running child in %s", cmd->pwd); > > ...but no worse than what we've already been doing. If VIR_DEBUG were > the only use of malloc() in the child, then it could just boil down to > conditionally compiling VIR_DEBUG statements where they can be turned on > for debugging the implementation, but compiled out later to avoid > deadlock (at least, I'm assuming that VIR_DEBUG uses malloc() to format > a timestamp prior to the message when posting to the circular buffer). In addition, VIR_DEBUG should be a no-op in normal operation. Only if someone raises the debug level significantly will it do work that triggers malloc & thus exposes the risk. > > res = chdir(cmd->pwd); > > + if (res < 0) { > > + virReportSystemError(errno, > > + _("Unable to change to %s"), cmd->pwd); > > + } > > + } > > + if (cmd->handshake) { > > + char c = res < 0 ? '0' : '1'; > > + int rv; > > + VIR_DEBUG("Notifying parent for handshake start on %d", cmd->handshakeWait[1]); > > + if (safewrite(cmd->handshakeWait[1], &c, sizeof(c)) != sizeof(c)) { > > Since c is char, sizeof(c) == 1 by definition. But I'm okay with you > spelling out the longer form. > > > + virReportSystemError(errno, "%s", _("Unable to notify parent process")); > > + return -1; > > + } > > + > > + /* On failure we pass the error message back to parent, > > + * so they don't have to dig through stderr logs > > + */ > > + if (res < 0) { > > + virErrorPtr err = virGetLastError(); > > + const char *msg = err ? err->message : > > + _("Unknown failure during hook execution"); > > Hmm, now that's a lot more use of malloc(), and not just in VIR_DEBUG() > calls. Code earlier that actually use virRaiseError would have already triggered malloc, so I still think this isn't making things any worse. > > + > > +void virCommandRequireHandshake(virCommandPtr cmd) > > +{ > > + if (pipe(cmd->handshakeWait) < 0) { > > NULL dereference if cmd had a previous failure. You need to guard with: > > if (!cmd || cmd->has_error) return; Opps, yeah. > > > + > > +int virCommandHandshakeWait(virCommandPtr cmd) > > +{ > > + char c; > > + int rv; > > + VIR_DEBUG("Wait for handshake on %d", cmd->handshakeWait[0]); > > Likewise. > > > + > > +int virCommandHandshakeNotify(virCommandPtr cmd) > > +{ > > + char c = '1'; > > + VIR_DEBUG("Notify handshake on %d", cmd->handshakeWait[0]); > > Likewise. > > > +++ b/src/util/command.h > > @@ -274,6 +274,11 @@ int virCommandRunAsync(virCommandPtr cmd, > > int virCommandWait(virCommandPtr cmd, > > int *exitstatus) ATTRIBUTE_RETURN_CHECK; > > > > +void virCommandRequireHandshake(virCommandPtr cmd); > > + > > +int virCommandHandshakeWait(virCommandPtr cmd); > > +int virCommandHandshakeNotify(virCommandPtr cmd); > > These last two could use ATTRIBUTE_RETURN_CHECK. All three could use > documentation. Ok. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list