During live migration of a VM from one host to another, the VM is suspended for an unpredictable amount of time. The actual downtime depends on how many pages are newly dirtied and on the bandwidth to the destination host. Since VM memory size grows faster than transfer rates, the currently available tunables will cause trouble for workloads within the VM which cannot handle large time jumps.

I have already written code to tweak the inner loop doing the actual migration work in libxc. But the patchset exposes the details of the loop to the cmdline; as such it is neither portable nor a friendly UI for the host admin.

Here is my proposal for a new option for virsh and 2 new options for xl:

  [xl | virsh --live] --max-suspend-time N --timeout N VM host

--max-suspend-time N:
  As the name suggests, the VM downtime must not be longer than specified. The code doing the migration has to estimate the transfer speed. If the VM is about to be suspended, it has to check whether the remaining dirty pages can be transferred within the required timeframe. If not, the migration is aborted, the VM continues to run on the src host, the new VM on the dst host is destroyed and an error is returned.

--timeout N:
  If a VM is busy and its workload causes many new dirty pages, the migrate command would take forever. This option is supposed to stop the migration attempt if the number of new dirty pages is too high. It would change the semantics of "virsh migrate --timeout n", which currently forces a suspend (according to the help text).

I'm not sure if it's acceptable to add this option just for the libxl (and maybe xend) target in libvirt, until someone steps up to also do the kvm part. For Xen it would be added for xl only, obviously.

Olaf
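
To illustrate the check described for --max-suspend-time and --timeout, here is a minimal sketch in Python. It is not the libxc code and not part of any patchset; the function names, the 4 KiB page size, and the idea of passing a measured transfer rate in bytes per second are all assumptions made for the example. It only shows the arithmetic: the estimated suspend time is the remaining dirty data divided by the observed transfer rate, and the migration is aborted if that estimate exceeds the limit or if the overall timeout has passed.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, hypothetical constant


def estimated_suspend_seconds(dirty_pages, bytes_per_second, page_size=PAGE_SIZE):
    """Estimate how long the VM would stay suspended while the
    remaining dirty pages are sent to the destination host."""
    if bytes_per_second <= 0:
        # No usable rate measurement yet: treat the estimate as unbounded.
        return float("inf")
    return dirty_pages * page_size / bytes_per_second


def may_suspend(dirty_pages, bytes_per_second, max_suspend_time):
    """True if the remaining pages fit into --max-suspend-time (seconds)."""
    return estimated_suspend_seconds(dirty_pages, bytes_per_second) <= max_suspend_time


def next_action(elapsed, dirty_pages, bytes_per_second,
                about_to_suspend, max_suspend_time, timeout):
    """Decide what the (hypothetical) migration loop does next.

    Returns "continue" (keep copying), "suspend" (do the final copy),
    or "abort" (keep the VM on the src host, destroy the dst VM).
    """
    if elapsed > timeout:
        # --timeout: the workload dirties pages too fast to ever finish.
        return "abort"
    if not about_to_suspend:
        return "continue"
    if may_suspend(dirty_pages, bytes_per_second, max_suspend_time):
        return "suspend"
    # --max-suspend-time: the final copy would take too long.
    return "abort"
```

For example, with 25600 dirty pages (100 MiB) left and a measured rate of 100 MiB/s, the estimate is one second, so a limit of 1 second allows the suspend while a limit of 0.5 seconds aborts the migration.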