On Fri, Feb 13, 2015 at 16:24:30 +0100, Michal Privoznik wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1179678
>
> When migrating with storage, libvirt iterates over domain disks and
> instructs qemu to migrate the ones we are interested in (shared, RO and
> source-less disks are skipped). The disks are migrated in series. No
> new disk is transferred until the previous one has been quiesced.
> This is checked on the qemu monitor via the 'query-block-jobs' command.
> Once a disk has been quiesced, it has effectively moved from copying
> its content to the mirroring state, where all disk writes are mirrored
> to the other side of the migration too. Having said that, there's one
> inherent error in the design. The monitor command we use reports only
> active jobs. So if the job fails for whatever reason, we will not see
> it anymore in the command output. And this can happen fairly easily:
> just try to migrate a domain with storage. If the storage migration
> fails (e.g. due to ENOSPC on the destination) we resume the domain on
> the destination and let it run on a partly copied disk.
>
> The proper fix is what even the comment in the code says: listen for
> qemu events instead of polling. If storage migration changes state, an
> event is emitted and we can act accordingly: either consider the disk
> copied and continue the process, or consider the disk mangled and abort
> the migration.
>
> Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
> ---
>  src/qemu/qemu_migration.c | 37 +++++++++++++++++--------------------
>  1 file changed, 17 insertions(+), 20 deletions(-)

ACK, please push patches 1..3 since I realized turning all this into a
condition is not going to be that easy...

Jirka
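
The design flaw described in the quoted commit message can be modelled in
a few lines. What follows is only a toy sketch in Python, not libvirt or
qemu code: FakeMonitor, polled_view, event_view and the "ready"/"failed"
event labels are all hypothetical names made up for illustration. It
assumes, as the commit message states, that the job list shows active
jobs only and that a state change emits an event.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMonitor:
    """Hypothetical stand-in for a monitor that lists only active jobs."""
    active_jobs: dict = field(default_factory=dict)  # disk -> "copying" | "mirroring"
    events: list = field(default_factory=list)       # (disk, "ready" | "failed")

    def start_copy(self, disk):
        self.active_jobs[disk] = "copying"

    def job_ready(self, disk):
        # Copy phase done; the job stays active and keeps mirroring writes.
        self.active_jobs[disk] = "mirroring"
        self.events.append((disk, "ready"))

    def job_failed(self, disk):
        # A failed job vanishes from the active list; only the event records it.
        del self.active_jobs[disk]
        self.events.append((disk, "failed"))


def polled_view(monitor, disk):
    """What a poller can conclude from the active-job list alone."""
    state = monitor.active_jobs.get(disk)
    if state == "mirroring":
        return "quiesced"
    if state == "copying":
        return "in progress"
    # Job is absent: the list alone cannot say why it went away.
    return "unknown"


def event_view(monitor, disk):
    """What an event listener concludes from the latest event for the disk."""
    for event_disk, kind in reversed(monitor.events):
        if event_disk == disk:
            return "quiesced" if kind == "ready" else "abort migration"
    return "in progress"


if __name__ == "__main__":
    mon = FakeMonitor()
    mon.start_copy("vda")
    mon.job_failed("vda")          # e.g. ENOSPC on the destination
    print(polled_view(mon, "vda"))  # prints "unknown"
    print(event_view(mon, "vda"))   # prints "abort migration"
```

In this model the poller gets no usable answer once the job disappears,
while the event listener sees the failure explicitly and can abort, which
is the behaviour the commit message argues for.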