On 04/13/2012 08:46 AM, Jiri Denemark wrote: > On Mon, Apr 09, 2012 at 21:52:21 -0600, Eric Blake wrote: >> Minimal patch to wire up all the pieces in the previous patches >> to actually enable a block copy job. By minimal, I mean that >> qemu creates the file (that is, no REUSE_EXT flag support yet), >> SELinux must be disabled, a lock manager is not informed, and >> the audit logs aren't updated. But those will be added as >> improvements in future patches. >> >> * src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function. >> (qemuDomainBlockRebase): Call it when appropriate. >> --- >> src/qemu/qemu_driver.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++- >> 1 files changed, 113 insertions(+), 1 deletions(-) >> + >> + priv = vm->privateData; >> + if (!(qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) && >> + qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_REOPEN))) { >> + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", > > We usually use VIR_ERR_CONFIG_UNSUPPORTED in such cases. Sure, easy to fix. Actually, looking at 'git grep -B3 -i "qemu binary" src/qemu', I counted an INTERNAL_ERROR, an OPERATION_FAILED, and a couple of OPERATION_INVALID that should all be cleaned up, to match the fact that the majority of uses did indeed favor CONFIG_UNSUPPORTED. I'll split the cleanup into a separate patch. > >> + _("block copy is not supported with this QEMU binary")); >> + goto cleanup; >> + } >> + if (vm->persistent) { >> + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", >> + _("domain is not transient")); >> + goto cleanup; >> + } > > I guess I wasn't paying enough attention somewhere but why do we forbid block > copy for persistent domains? I understand why we want to forbid certain > operations when block copy is active but I don't see a reason for penalizing > persistent domains. It was in patch 8/18 where I added the restrictions, and hopefully documented in that commit message why limiting things to transient is a good first step: 1. the first client of live migration is oVirt, which uses transient domains 2. qemu does not (yet) provide a way to resume a mirror when restarting a domain, so anything that would require restoring a domain from saved state is broken: incoming migration, virsh restore, and virsh start after a managed save. But users would be upset if they saved a domain, only to find out that they cannot then restore it, so I squelched things one step earlier in the process, by preventing any save of a domain so that we never have a broken save image in the first place. My worry now comes from the fact that managedsave is on the list of forbidden operations. If a domain is transient, managedsave is already forbidden (it is assumed that you either don't care about the domain if the host dies, or that you are running a higher-level app like oVirt that knows how to rebuild the guest on a different host). But if a guest is persistent, and you use the libvirt-guests init script, then you have a right to assume that rebooting your host will resume your guests in the same state that they were prior to the host going down - because libvirt-guests uses managedsave. If we allow a mirror job on a persistent domain, we violate this assumption (libvirt-guests will fail to save the guest). Therefore, I forbid to start a mirror job on a persistent domain, just as I forbid to 'virsh define' a transient domain into a persistent domain if a mirror job is active. If, at a later date, qemu comes up with a way to resume mirroring when restarting a domain, we can relax these restrictions. >> + >> + /* Actually start the mirroring */ >> + qemuDomainObjEnterMonitorWithDriver(driver, vm); >> + ret = qemuMonitorDriveMirror(priv->mon, NULL, device, dest, format, flags); >> + if (ret == 0 && bandwidth != 0) >> + ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL, >> + BLOCK_JOB_SPEED_INTERNAL); >> + qemuDomainObjExitMonitorWithDriver(driver, vm); > > Can BLOCK_JOB_SPEED_INTERNAL fail while block copy remains running? If so, we > should try to abort the job in case we fail to set the speed, otherwise we > will report an error and forget about the job while qemu will keep copying > data. Good catch, and virDomainBlockPull() suffers from the same ramifications. I think both code paths need the same fix. >> @@ -11889,6 +12000,7 @@ static int >> qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, >> unsigned int flags) >> { >> + virCheckFlags(0, -1); >> return qemuDomainBlockRebase(dom, path, NULL, bandwidth, flags); >> } > > Hmm, personally, I would make qemuDomainBlockPull call qemuDomainBlockJobImpl > directly instead of going through qemuDomainBlockRebase while adding the flags > check... Easy enough to change. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list