Re: [PATCH 0/3] libxl migration improvements

On 13.11.2014 03:36, Jim Fehlig wrote:
This series of patches fixes problems discovered in libxl migration.

The first patch fixes an issue that went undetected while testing the
initial implementation of migration.  Receiving migration data occurs
in the context of an event loop callback, effectively blocking the
event loop during the entire migration process.  The patch moves the
work of receiving migration data to a thread.
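
The effect described above can be reproduced outside libvirt. What follows is a minimal sketch in Python's asyncio, not the libxl driver code: receive_migration_data() and count_keepalives() are hypothetical names, and the "receive" is only a sleep. It counts how many keepalive ticks an event loop manages to serve while a long receive is in progress, first with the receive called directly on the loop (as in an event callback), then with the receive handed to a worker thread.

```python
import asyncio
import time


def receive_migration_data(duration: float) -> None:
    """Stand-in for a long, blocking receive (hypothetical; it only sleeps)."""
    time.sleep(duration)


async def count_keepalives(in_thread: bool, duration: float = 0.2) -> int:
    """Return how many keepalive ticks the loop served during the receive."""
    ticks = 0
    receiving = True

    async def keepalive() -> None:
        nonlocal ticks
        while receiving:
            await asyncio.sleep(0.01)
            if receiving:
                ticks += 1  # the loop was free to answer a keepalive

    responder = asyncio.create_task(keepalive())
    await asyncio.sleep(0)  # let the responder start before the receive
    if in_thread:
        # A worker thread does the blocking receive; the loop stays free.
        await asyncio.to_thread(receive_migration_data, duration)
    else:
        # Blocking call made directly on the loop, as in an event callback.
        receive_migration_data(duration)
    receiving = False
    await responder
    return ticks


def run(in_thread: bool) -> int:
    return asyncio.run(count_keepalives(in_thread))


if __name__ == "__main__":
    print("ticks, receive on the loop:  ", run(False))
    print("ticks, receive in a thread:  ", run(True))
```

With the receive on the loop, no tick is served until the receive returns, which is the situation in which a peer waiting on keepalives would give up. With the receive in a thread, the loop keeps ticking throughout.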

Interestingly, this issue manifested in a failed migration due to failed
keepalives, which would kill virsh's connection to dst host.  The dst host
failed to respond to keepalives since its event loop was blocked on
receiving migration data.  Ultimately the migration perform phase would
succeed leaving a running domain on dst.  However, the subsequent finish
phase would fail since virsh's connection to dst had been killed by the
keepalive failure.  Since finish failed, the confirm phase would resume
the domain on src.  Yikes! Same domain running on two different hosts :(.

Patches 2 and 3 improve handling of errors in the event the perform or
finish phases of migration fail.  See the individual patches for details.
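
To make the failure mode concrete, here is a toy model of the perform, finish and confirm phases. It is only a sketch based on the description above and on the patch titles, not on the patches themselves: migrate(), its parameters and the "lost"/"error" outcomes are hypothetical names introduced for illustration. "lost" means finish never reached dst (for example because the connection was dropped), "error" means finish ran on dst but failed.

```python
from enum import Enum


class State(Enum):
    ABSENT = "absent"
    PAUSED = "paused"
    RUNNING = "running"


def migrate(start_paused: bool, destroy_on_finish_failure: bool, finish: str) -> dict:
    """Toy model of one migration; finish is "ok", "error" or "lost"."""
    # The domain on src is suspended while its state is transferred.
    src, dst = State.PAUSED, State.ABSENT

    # Perform: the domain is created on dst, paused or running.
    dst = State.PAUSED if start_paused else State.RUNNING

    # Finish: runs on dst and lets the domain run on success.
    finished = finish == "ok"
    if finished:
        dst = State.RUNNING
    elif finish == "error" and destroy_on_finish_failure:
        dst = State.ABSENT  # clean up the half-migrated domain
    # For "lost", dst is left exactly as perform created it.

    # Confirm: runs on src; resume the domain there unless finish succeeded.
    src = State.ABSENT if finished else State.RUNNING
    return {"src": src, "dst": dst}


def running_hosts(result: dict) -> list:
    """Names of the hosts on which the domain ends up running."""
    return [host for host, state in result.items() if state is State.RUNNING]


if __name__ == "__main__":
    print(running_hosts(migrate(False, False, "lost")))
    print(running_hosts(migrate(True, True, "lost")))
    print(running_hosts(migrate(True, True, "error")))
    print(running_hosts(migrate(True, True, "ok")))
```

In this model the first call reproduces the scenario above, with the domain running on both hosts. Once the domain starts paused on dst and is destroyed when finish fails, every outcome leaves it running on exactly one host.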


Jim Fehlig (3):
   libxl: Receive migration data in a thread
   libxl: start domain paused on migration dst
   libxl: destroy domain in migration finish phase on failure

  src/libxl/libxl_migration.c | 75 ++++++++++++++++++++++++++++++---------------
  1 file changed, 51 insertions(+), 24 deletions(-)


ACK series, but see my comment to 1/3.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list