On 23.10.2012 16:51, Eric Blake wrote: > On 10/23/2012 03:39 AM, Michal Privoznik wrote: >> Currently, we use iohelper when saving/restoring a domain. >> However, if there's some kind of error (like I/O) it is not >> propagated to libvirt. Since it is not qemu who is doing >> the actual write() it will not get error. The iohelper does. >> Therefore we should check for iohelper errors as it makes >> libvirt more user friendly. >> --- >> src/libvirt_private.syms | 1 + >> src/qemu/qemu_driver.c | 7 ++++- >> src/util/virfile.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++ >> src/util/virfile.h | 2 + >> 4 files changed, 55 insertions(+), 2 deletions(-) > > I agree that catching iohelper errors is needed, but I'm worried that > you might introduce deadlock with this approach (then again, it may just > be theoretical): > >> +++ b/src/qemu/qemu_driver.c >> @@ -2945,6 +2945,7 @@ endjob: >> if (rc < 0) >> VIR_WARN("Unable to resume guest CPUs after save failure"); >> } >> + virFileWrapperFdCatchError(wrapperFd); >> } >> if (qemuDomainObjEndAsyncJob(driver, vm) == 0) >> vm = NULL; >> @@ -3282,9 +3283,11 @@ doCoreDump(struct qemud_driver *driver, >> >> cleanup: >> VIR_FORCE_CLOSE(fd); >> - virFileWrapperFdFree(wrapperFd); >> - if (ret != 0) >> + if (ret != 0) { >> + virFileWrapperFdCatchError(wrapperFd); >> unlink(path); > > You aren't doing any reading of the stderr file descriptor until you get > to the cleanup label. But suppose that iohelper encounters situations > that cause it to write more than PIPE_BUF bytes to stderr, but while > still continuing to run. Then, because no one is reading the other end > of the pipe, iohelper blocks on writes to stderr instead of proceeding > on with the rest of its operations, and thus never gets to a point where > it will exit in order to kick libvirtd over to the cleanup label to > start reading the stderr pipe in the first place. > >> +++ b/src/util/virfile.c >> @@ -135,6 +135,7 @@ virFileDirectFdFlag(void) >> * read-write is not supported, just a single direction. */ >> struct _virFileWrapperFd { >> virCommandPtr cmd; /* Child iohelper process to do the I/O. */ >> + int err_fd; /* FD to read stderr of @cmd */ >> }; >> >> #ifndef WIN32 >> @@ -229,6 +230,14 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags) >> virCommandAddArg(ret->cmd, "0"); >> } >> >> + /* In order to catch iohelper stderr, we must: >> + * - pass a FD to virCommand (-1 to auto-allocate one) >> + * - change iohelper's env so virLog functions print to stderr >> + */ >> + ret->err_fd = -1; >> + virCommandSetErrorFD(ret->cmd, &ret->err_fd); >> + virCommandAddEnvPair(ret->cmd, "LIBVIRT_LOG_OUTPUTS", "1:stderr"); > > Would it work to use virCommandSetErrorBuffer() instead, or does that > rely on using virCommandRun() to properly populate the buffer via > poll(), where we are stuck using virCommandRunAsync() instead? Or, > since we are already using virCommandRunAsync(), does that mean we need > to figure out how to incorporate a poll() to favorably handle stderr in > addition to everything else that iohelper is already handling? > Yeah, I've tried te virCommandSetErrorBuffer() way firstly. But it requires the poll() which is wrapped in virCommandProcessIO(). However, it is static so it would needed to be dig out and run in a separate thread. But we can run virCommandRun directly then. So I think I can incorporate our event loop. Just add a callback once ret->err_fd is initialized. > Or am I worrying too much, and if iohelper hits any issue where it would > be logging to stderr in the first place, it will be exiting before > filling up the stderr pipe and thus never deadlocking? Well, with current implementation of iohelper there is only one place where it doesn't die right after reporting an error: src/util/iohelper.c:160 I don't know how possible is to hit that. > > At any rate, depending on how that question is answered, this code does > indeed look useful. > Okay, since we would have to keep this limitation in mind when changing iohelper (and honestly, I'll forget immediately after the push) I think we should make this safer and read the error pipe in the event loop. I'll update the patch and resend. Michal -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list