On 07/16/2014 03:56 PM, Eric Blake wrote: > On 07/16/2014 08:30 AM, Eric Blake wrote: >> On 07/16/2014 08:12 AM, Jason J. Herne wrote: >>> From: "Jason J. Herne" <jjherne@xxxxxxxxxx> >>> >>> libvirt_iohelper is a helper process that is exec'ed and used to handle I/O >>> during a Qemu managed save operation. Due to a missing call to >>> virFileWrapperFdClose, all I/O error messages reported by iohelper are lost. >>> >>> +++ b/src/qemu/qemu_driver.c >>> @@ -3015,6 +3015,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver, >>> >>> cleanup: >>> VIR_FORCE_CLOSE(fd); >>> + virFileWrapperFdClose(wrapperFd); >>> virFileWrapperFdFree(wrapperFd); >>> VIR_FREE(xml); > > But qemuDomainSaveMemory() already has a call to virFileWrapperFdClose() > earlier on; worse, the current implementation of virFileWrapperFdClose() > is not designed to be called twice if an error occurred (rather, if > there is an error, two calls will end up logging the error twice, when > once would have been sufficient). How are you getting to a point where > the cleanup label is reached without going through the earlier close? > > Is there some easy setup you used in testing this patch, so that I can > reproduce a scenario where iohelper will reliably fail? Okay, looking into this a bit more: I made the following change: diff --git i/src/util/iohelper.c w/src/util/iohelper.c index 8a3c377..dfb45e1 100644 --- i/src/util/iohelper.c +++ w/src/util/iohelper.c @@ -307,6 +307,7 @@ main(int argc, char **argv) if (delete) unlink(path); + fprintf(stderr, _("goodbye world\n")); goto error; return 0; error: then tried to do a 'virsh save testvm1 /tmp/save'. Even without your patch, I get a very nice error: error: Failed to save domain testvm1 to /tmp/save error: internal error: Child process (LIBVIRT_LOG_OUTPUTS=1:stderr /home/eblake/libvirt/src/libvirt_iohelper /tmp/save 0 1) unexpected exit status 1: goodbye world /home/eblake/libvirt/src/libvirt_iohelper: unknown failure with /tmp/save But if I then rework the iohelper patch: diff --git i/src/util/iohelper.c w/src/util/iohelper.c index 8a3c377..efb1366 100644 --- i/src/util/iohelper.c +++ w/src/util/iohelper.c @@ -301,6 +301,7 @@ main(int argc, char **argv) exit(EXIT_FAILURE); } + fprintf(stderr, _("goodbye world\n")); goto error; if (fd < 0 || runIO(path, fd, oflags, length) < 0) goto error; the error is now: error: Failed to save domain testvm1 to /tmp/save error: operation failed: domain save job: unexpectedly failed So the problem is that we have _two_ possible sources of errors (qemu reporting failure because it can't write to iohelper, and iohelper reporting an error from whatever other reason, such as disk full). If qemu finishes, we have only the iohelper message and properly report it; but if we have both failures, then the qemu error takes priority, and in this case it is lower priority. There are also cases where qemu will error out but iohelper succeeds (such as if qemu refuses to migrate because the guest has hostdev passthrough). So I _think_ what we want to do is always check BOTH places for error; if only one of the two fails, use that message. If both fail, then I don't know whether it is possible to have a heuristic for which failure message is more meaningful, or whether it is better to report both errors (even though it will often be the case that one error was a knock-on effect from the other). But I'm a bit stuck on the best way to implement that. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list