On 03/19/2012 10:18 AM, Jiri Denemark wrote:
> Destination daemon should not rely on the client or source daemon
> (depending on the type of migration) to call Finish when migration
> fails, because the client may crash before it can do so. The domain
> prepared for incoming migration is set to be destroyed (and migration
> job cleaned up) when connection with the client closes but this is not
> enough. If the associated qemu process crashes after Prepare step and
> the domain is cleaned up before the connection gets closed, autodestroy
> is not called for the domain and the migration job remains set. In case
> the domain is defined on destination host (i.e., it is not completely
> removed once destroyed) we keep the job set forever. To fix this, we
> register a cleanup callback which is responsible for cleaning up the
> migration-in job when a domain dies anywhere between Prepare and Finish
> steps. Note that we can't blindly clean any job when spotting EOF on
> monitor since normally an API is running at that time.
> ---
>  src/qemu/qemu_domain.c    |    2 --
>  src/qemu/qemu_domain.h    |    2 ++
>  src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
>  3 files changed, 24 insertions(+), 2 deletions(-)

I'm restating my understanding of the bug, to make sure I understand
why your patch helps:

- src requests a migration
- dest starts a qemu process using information from the src, but the
  destination happens to be running an older qemu that can't support
  the full migration
- qemu dies, but the destination hasn't seen a 'Finish' from the
  source, so the job remains open and the domain remains
- connection is broken, but the open job prevents reclaiming the
  autodestroy domain on the destination
- new connection is made, but source can't migrate because destination
  is already locked up on the stale attempt

and the fix is adding a new callback, which says if qemu dies while the
callback is registered, we cancel the migration job; therefore, even
without a 'Finish' from the source, the autodestroy can now kick in.

ACK.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
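
For readers following the thread, the mechanism discussed above can be sketched roughly as below. This is only an illustration of the idea, not the code from the patch: every name here (struct domain, job_t, migrate_prepare, migrate_finish, process_died) is hypothetical and does not correspond to libvirt's real types or functions. The sketch assumes a single per-domain callback slot that is set only between Prepare and Finish, so that an unexpected qemu death clears the migration-in job but leaves any other running job alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job states; these are NOT libvirt's real enum values. */
typedef enum { JOB_NONE, JOB_MIGRATION_IN, JOB_OTHER_API } job_t;

struct domain;
typedef void (*cleanup_cb)(struct domain *dom);

/* Hypothetical, heavily simplified stand-in for a per-domain object. */
struct domain {
    job_t job;            /* job currently held on the domain */
    bool process_alive;   /* whether the emulator process is running */
    cleanup_cb cleanup;   /* non-NULL only between Prepare and Finish */
};

/* Callback: drop the incoming-migration job, and nothing else. */
static void migration_in_cleanup(struct domain *dom)
{
    if (dom->job == JOB_MIGRATION_IN)
        dom->job = JOB_NONE;
}

/* Prepare step: start the job and register the cleanup callback. */
void migrate_prepare(struct domain *dom)
{
    assert(dom != NULL);
    dom->job = JOB_MIGRATION_IN;
    dom->process_alive = true;
    dom->cleanup = migration_in_cleanup;
}

/* Finish step: the normal path unregisters the callback and ends the job. */
void migrate_finish(struct domain *dom)
{
    assert(dom != NULL);
    dom->cleanup = NULL;
    dom->job = JOB_NONE;
}

/* Called when the emulator process dies unexpectedly (e.g. monitor EOF).
 * Only a registered callback may touch the job; a job owned by some
 * other running API is deliberately left untouched. */
void process_died(struct domain *dom)
{
    assert(dom != NULL);
    dom->process_alive = false;
    if (dom->cleanup != NULL) {
        cleanup_cb cb = dom->cleanup;
        dom->cleanup = NULL;   /* one-shot: unregister before calling */
        cb(dom);
    }
}
```

Calling migrate_prepare() and then process_died() on the same domain leaves it with no job set, even though migrate_finish() was never reached. Calling process_died() on a domain that holds some other job and has no callback registered leaves that job in place, which matches the caveat about not blindly cleaning any job on monitor EOF.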
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list