On Mon, Apr 16, 2012 at 23:06:01 -0600, Eric Blake wrote:
> This is the bare minimum to end a copy job (of course, until a
> later patch adds the ability to start a copy job, this patch
> doesn't do much in isolation; I've just split the patches to
> ease the review).
>
> This patch intentionally avoids SELinux, lock manager, and audit
> actions. Also, if libvirtd restarts at the exact moment that a
> 'drive-reopen' is in flight, the only proper way to detect the
> outcome of that 'drive-reopen' would be to first pass in a witness
> fd with 'getfd', then at libvirtd restart, probe whether that file
> is still empty. This patch is enough to test the common case of
> success when used correctly, while saving the subtleties of proper
> cleanup for worst-case errors for later.
>
> When a mirror job is started, cancelling the job safely reverts back
> to the source disk, regardless of whether the destination is in
> phase 1 (streaming, in which case the destination is worthless) or
> phase 2 (mirroring, in which case the destination is synced up to
> the source at the time of the cancel). Our existing code does just
> fine in either phase, other than some bookkeeping cleanup.
>
> Pivoting the job requires the use of the new 'drive-reopen' command.
> Here, failure of the command is potentially catastrophic to the
> domain, since the initial qemu implementation rips out the old disk
> before attempting to open the new one; qemu will attempt a recovery
> path of retrying the reopen on the original source, but if that also
> fails, the domain is hosed, with nothing libvirt can do about it.
> If qemu 1.2 ever adds 'drive-reopen' inside 'transaction', then the
> problem will no longer exist (a transaction promises not to close
> the old file until after the new file is proven to work), at which
> point we would add a VIR_DOMAIN_REBASE_COPY_ATOMIC that fails up
> front if we detect an older qemu with the risky drive-reopen.
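
For readers following the thread, the cancel and pivot outcomes described
above can be sketched as a small C model. This is only an illustration
under stated assumptions: the enum and function names are hypothetical and
are not libvirt or qemu code, and the rule that a pivot is refused before
phase 2 is an assumption that the quoted text does not state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the outcomes in the commit message;
 * none of these names exist in libvirt or qemu. */
enum copy_phase { PHASE_STREAMING = 1, PHASE_MIRRORING = 2 };
enum active_disk { DISK_SOURCE, DISK_DEST, DISK_NONE };

/* Cancel: the guest stays on the source disk in either phase. */
enum active_disk end_by_cancel(enum copy_phase phase)
{
    assert(phase == PHASE_STREAMING || phase == PHASE_MIRRORING);
    return DISK_SOURCE;
}

/* After a cancel, the destination is a usable point-in-time copy
 * only if the job had already reached phase 2 (mirroring). */
bool dest_usable_after_cancel(enum copy_phase phase)
{
    return phase == PHASE_MIRRORING;
}

/* Pivot through the non-transactional 'drive-reopen': the old disk is
 * closed before the new one is opened, so two failures in a row leave
 * the guest with no disk at all. */
enum active_disk end_by_pivot(enum copy_phase phase,
                              bool open_dest_ok, bool reopen_source_ok)
{
    if (phase != PHASE_MIRRORING)
        return DISK_SOURCE;   /* assumption: pivot refused, nothing changes */
    if (open_dest_ok)
        return DISK_DEST;     /* pivot succeeded */
    if (reopen_source_ok)
        return DISK_SOURCE;   /* qemu's recovery path worked */
    return DISK_NONE;         /* both opens failed; the domain is lost */
}
```

In this model, end_by_pivot(PHASE_MIRRORING, false, false) yields DISK_NONE,
which is the case a 'drive-reopen' inside 'transaction' would rule out.
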
>
> Interesting side note: while snapshot-create --disk-only creates a
> copy of the disk at a point in time by moving the domain on to a
> new file (the copy is the file now in the just-extended backing
> chain), blockjob --abort of a copy job creates a copy of the disk
> while keeping the domain on the original file. There may be
> potential improvements to the snapshot code to exploit block copy
> over multiple disks all at one point in time. And, if
> 'block-job-cancel' were made part of 'transaction', you could
> copy multiple disks at the same point in time without pausing
> the domain. This also implies we may want to add a --quiesce flag
> to the pivot operation, so that when breaking a mirror, the side
> of the mirror that we are abandoning is at least in a stable state
> with regards to guest I/O.
>
> * src/qemu/qemu_driver.c (qemuDomainBlockJobAbort): Accept new flag.
> (qemuDomainBlockPivot): New helper function.
> (qemuDomainBlockJobImpl): Implement it.
> ---
>
> was 11/18 in v4
> v5: no real change, improve commit message
>
>  src/qemu/qemu_driver.c |  106 +++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 105 insertions(+), 1 deletions(-)

OK

Jirka