On Thu, Mar 12, 2015 at 14:23:48 -0600, Eric Blake wrote: > https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that > after a series of disk snapshots into existing destination images, > followed by active commits of the top image, it is possible for > qemu 2.2 and earlier to end up tracking a different name for the > image than what it would have had when opening the chain afresh. > That is, when starting with the chain 'a <- b <- c', the name > associated with 'b' is how it was spelled in the metadata of 'c', > but when starting with 'a', taking two snapshots into 'a <- b <- c', > then committing 'c' back into 'b', the name associated with 'b' is > now the name used when taking the first snapshot. > > Sadly, older qemu doesn't know how to treat different spellings of > the same filename as identical files (it uses strcmp() instead of > checking for the same inode), which means libvirt's attempt to > commit an image using solely the names learned from qcow2 metadata > fails with a cryptic: > > error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found > > even though the file exists. Trying to teach libvirt the rules on > which name qemu will expect is not worth the effort (besides, we'd > have to remember it across libvirtd restarts, and track whether a > file was opened via metadata or via snapshot creation for a given > qemu process); it is easier to just always directly ask qemu what > string it expects to see in the first place. > > As a safety valve, we validate that any name returned by qemu > still maps to the same local file as we have tracked it, so that > a compromised qemu cannot accidentally cause us to act on an > incorrect file. It would still allow to act on remote storage though. Also if qemu is corrupted in a way that it'd lie to us correctly via monitor it would be most probably also able to act on the file itself. As the labelling is done from the internal structures it should not allow to do anything besides what the instance is already allowed. A bigger problem though would be that since we don't store the backing chain internally all the time, qemu could rewrite the metadata in the image and libvirt would happily accept those. Corrupting qemu in that way is very unprobable though IMO. > > * src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New > prototype. > * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup): > Likewise. > * src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function. > * src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup) > (qemuMonitorJSONDiskNameLookupOne): Likewise. > * src/qemu/qemu_driver.c (qemuDomainBlockCommit) > (qemuDomainBlockJobImpl): Use it. > > Signed-off-by: Eric Blake <eblake@xxxxxxxxxx> > --- > > v2: as suggested by Dan, add a sanity checking valve to ensure we > don't use qemu's string until vetting that it resolves to the same > local name we are already tracking > > src/qemu/qemu_driver.c | 28 ++++++------- > src/qemu/qemu_monitor.c | 20 ++++++++- > src/qemu/qemu_monitor.h | 8 +++- > src/qemu/qemu_monitor_json.c | 97 +++++++++++++++++++++++++++++++++++++++++++- > src/qemu/qemu_monitor_json.h | 9 +++- > 5 files changed, 144 insertions(+), 18 deletions(-) > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index b3263ac..f0e530d 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c ... > @@ -16172,8 +16169,12 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm, > } > > qemuDomainObjEnterMonitor(driver, vm); > - ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, > - speed, mode, async); > + if (baseSource) > + basePath = qemuMonitorDiskNameLookup(priv->mon, device, disk->src, I remember that at some point accessing of domain definition while in the monitor was not okay for some reason, but I can't now remember why nor whether it was fixed. > + baseSource); > + if (!baseSource || basePath) > + ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath, > + speed, mode, async); > if (qemuDomainObjExitMonitor(driver, vm) < 0) > ret = -1; > if (ret < 0) { ... > diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c > index d869a72..cf7dc5e 100644 > --- a/src/qemu/qemu_monitor.c > +++ b/src/qemu/qemu_monitor.c > @@ -1,7 +1,7 @@ > /* > * qemu_monitor.c: interaction with QEMU monitor console > * > - * Copyright (C) 2006-2014 Red Hat, Inc. > + * Copyright (C) 2006-2015 Red Hat, Inc. Shouldn't we employ something as in gnulib, where copyrights would be bumped at once everywhere? > * Copyright (C) 2006 Daniel P. Berrange > * > * This library is free software; you can redistribute it and/or ... > diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h > index b30da34..e67d800 100644 > --- a/src/qemu/qemu_monitor.h > +++ b/src/qemu/qemu_monitor.h > @@ -1,7 +1,7 @@ > /* > * qemu_monitor.h: interaction with QEMU monitor console > * > - * Copyright (C) 2006-2014 Red Hat, Inc. > + * Copyright (C) 2006-2015 Red Hat, Inc. > * Copyright (C) 2006 Daniel P. Berrange Because it's annoying when you are trying to trim uninterresting parts from a patch when you are doing a review. > * > * This library is free software; you can redistribute it and/or ... > diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c > index c16f3ca..522fd17 100644 > --- a/src/qemu/qemu_monitor_json.c > +++ b/src/qemu/qemu_monitor_json.c > @@ -1,7 +1,7 @@ > /* > * qemu_monitor_json.c: interaction with QEMU monitor console > * > - * Copyright (C) 2006-2014 Red Hat, Inc. > + * Copyright (C) 2006-2015 Red Hat, Inc. > * Copyright (C) 2006 Daniel P. Berrange > * > * This library is free software; you can redistribute it and/or ... > @@ -3883,6 +3883,101 @@ qemuMonitorJSONDrivePivot(qemuMonitorPtr mon, const char *device, > } > > > +static char * > +qemuMonitorJSONDiskNameLookupOne(virJSONValuePtr image, > + virStorageSourcePtr top, > + virStorageSourcePtr target) > +{ > + virJSONValuePtr backing; > + char *ret; > + > + if (!top) > + return NULL; In case the backing chain as remembered by libvirt is shorter than what qemu sees you don't report error. Since the caller checks whether an error was set and if not then adds one, please state this fact in a comment here as it's not obvious until you follow the call chain. > + if (top != target) { > + backing = virJSONValueObjectGet(image, "backing-image"); > + return qemuMonitorJSONDiskNameLookupOne(backing, top->backingStore, > + target); Also the recursion doesn't take into account that for some reason qemu might report a shorter chain than libvirt thinks, which would crash here. > + } > + if (VIR_STRDUP(ret, virJSONValueObjectGetString(image, "filename")) < 0) > + return NULL; > + /* Sanity check - the name qemu gave us should resolve to the same > + file tracked by our target description. */ > + if (virStorageSourceIsLocalStorage(target) && > + STRNEQ(ret, target->path) && > + !virFileLinkPointsTo(ret, target->path)) { > + virReportError(VIR_ERR_INTERNAL_ERROR, > + _("qemu block name '%s' doesn't match expected '%s'"), > + ret, target->path); > + VIR_FREE(ret); > + } > + return ret; > +} > + > + > +char * > +qemuMonitorJSONDiskNameLookup(qemuMonitorPtr mon, > + const char *device, > + virStorageSourcePtr top, > + virStorageSourcePtr target) > +{ > + char *ret = NULL; > + virJSONValuePtr cmd = NULL; > + virJSONValuePtr reply = NULL; > + virJSONValuePtr devices; > + size_t i; > + > + cmd = qemuMonitorJSONMakeCommand("query-block", NULL); > + if (!cmd) > + return NULL; > + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) > + goto cleanup; > + > + devices = virJSONValueObjectGet(reply, "return"); > + if (!devices || devices->type != VIR_JSON_TYPE_ARRAY) { > + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > + _("block info reply was missing device list")); > + goto cleanup; > + } > + > + for (i = 0; i < virJSONValueArraySize(devices); i++) { > + virJSONValuePtr dev = virJSONValueArrayGet(devices, i); > + virJSONValuePtr inserted; > + virJSONValuePtr image; > + const char *thisdev; > + > + if (!dev || dev->type != VIR_JSON_TYPE_OBJECT) { [1] > + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > + _("block info device entry was not in expected format")); > + goto cleanup; > + } > + > + if ((thisdev = virJSONValueObjectGetString(dev, "device")) == NULL) { You are mixing styles of cheching of the pointer to be non-null within a few lines ([1]) > + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > + _("block info device entry was not in expected format")); > + goto cleanup; > + } > + > + if (STREQ(thisdev, device)) { > + if ((inserted = virJSONValueObjectGet(dev, "inserted")) && > + (image = virJSONValueObjectGet(inserted, "image"))) { > + ret = qemuMonitorJSONDiskNameLookupOne(image, top, target); > + } > + break; > + } > + } > + if (!ret && !virGetLastError()) > + virReportError(VIR_ERR_INTERNAL_ERROR, > + _("unable to find backing name for device %s"), > + device); > + > + cleanup: > + virJSONValueFree(cmd); > + virJSONValueFree(reply); > + > + return ret; > +} > + > + > int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, > const char *cmd_str, > char **reply_str, ACK if you add the comment and fix the potential crash. I'm currently OK with accessing domain definition while it's unlocked (but guarded via the domain job) as I don't have an counter example where it wouldn't work correctly. Peter
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list