On 01 Oct 2014, at 12:07, Jiri Denemark <jdenemar@xxxxxxxxxx> wrote:

> On Wed, Oct 01, 2014 at 10:45:33 +0200, Cristian KLEIN wrote:
>> On 2014-09-30 17:16, Daniel P. Berrange wrote:
>>> On Tue, Sep 30, 2014 at 05:11:03PM +0200, Jiri Denemark wrote:
>>>> On Tue, Sep 30, 2014 at 16:39:22 +0200, Cristian Klein wrote:
>>>>> Signed-off-by: Cristian Klein <cristian.klein@xxxxxxxxx>
>>>>> ---
>>>>>  include/libvirt/libvirt.h.in | 1 +
>>>>>  src/libvirt.c                | 7 +++++++
>>>>>  2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
>>>>> index 5217ab3..82f3aeb 100644
>>>>> --- a/include/libvirt/libvirt.h.in
>>>>> +++ b/include/libvirt/libvirt.h.in
>>>>> @@ -1225,6 +1225,7 @@ typedef enum {
>>>>>      VIR_MIGRATE_ABORT_ON_ERROR = (1 << 12), /* abort migration on I/O errors happened during migration */
>>>>>      VIR_MIGRATE_AUTO_CONVERGE = (1 << 13), /* force convergence */
>>>>>      VIR_MIGRATE_RDMA_PIN_ALL = (1 << 14), /* RDMA memory pinning */
>>>>> +    VIR_MIGRATE_POSTCOPY = (1 << 15), /* enable (but don't start) post-copy */
>>>>>  } virDomainMigrateFlags;
>>>>
>>>> I still think we should add an extra flag to start post copy
>>>> immediately. To address your concerns about it, I don't think it's
>>>> implementing a policy in libvirt. It's for apps that want to make sure
>>>> migration converges without having to spawn another thread and monitor
>>>> the progress or wait for a timeout. It's a bit similar to migrating a
>>>> paused domain vs. migrating a running domain and pausing it when it
>>>> doesn't seem to converge.
>>>
>>> Your point about spawning another thread makes me wonder if we should
>>> actually look at adding a 'VIR_MIGRATE_ASYNC' method (that would require
>>> P2P migration of course). If this flag were set, virDomainMigrateXXX would
>>> only block for long enough to start the migration and then return.
>>>
>>> Callers can use the job info API to monitor progress & success/failure.
>>>
>>> Then we wouldn't have to keep adding flags like you suggest - apps can
>>> just easily call the appropriate API right away with no threads needed
>>
>> This would make a lot of sense. The user would call:
>>
>> """
>> virDomainMigrateXXX(..., VIR_MIGRATE_POSTCOPY | VIR_MIGRATE_ASYNC)
>> virDomainMigrateStartPostCopy(...)
>> """
>>
>> Would this be seen as more cumbersome than having a dedicated
>> VIR_MIGRATE_POSTCOPY_AUTOSTART?
>
> The ASYNC flag Daniel suggested makes sense, so I guess you can just
> ignore my request for a special flag. Although, I don't think the ASYNC
> stuff needs to be done within this series, let's just focus on the
> post-copy stuff.

Hi Jirka,

I talked to the qemu post-copy guys (Andrea and Dave in CC). Starting
post-copy immediately is a poor choice for performance: the VM starts
on the destination hypervisor before its read-only or kernel memory
has arrived. Those pages then have to be pulled on demand, which causes
a lot of overhead and interruptions in the VM’s execution. It is better
to do one pass of pre-copy first and only then trigger post-copy.

In fact, I did an experiment with a video streaming VM: starting
post-copy after the first pass of pre-copy (instead of starting
post-copy immediately) reduced downtime from 3.5 seconds to under
1 second.

Given all of the above, I propose the following post-copy API in
libvirt:

    virDomainMigrateXXX(..., VIR_MIGRATE_ENABLE_POSTCOPY)
    virDomainMigrateStartPostCopy(...) // from a different thread

This is for those who just need the post-copy mechanism and want to
implement a policy themselves.

    virDomainMigrateXXX(..., VIR_MIGRATE_POSTCOPY_AFTER_PRECOPY)

This is for those who want to use post-copy without caring about any
low-level details; it offers a policy that is good enough for most
cases.

What do you think? Would you accept patches that implement this API?

Cristian
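
To make the difference between the two proposed modes concrete, here is a
minimal sketch of the decision the second flag would hand over to libvirt.
It is not libvirt code: the SKETCH_* names, the bit positions, and the
sketch_progress structure are all assumptions made for illustration, and
only the flag semantics follow the proposal above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two proposed flags. The bit positions
 * are assumptions for this sketch, not values from any libvirt header. */
enum sketch_migrate_flags {
    SKETCH_MIGRATE_ENABLE_POSTCOPY        = 1u << 15, /* enable, caller starts it */
    SKETCH_MIGRATE_POSTCOPY_AFTER_PRECOPY = 1u << 16  /* enable, start after pass 1 */
};

/* Hypothetical snapshot of migration progress, as a monitor might see it. */
struct sketch_progress {
    unsigned precopy_passes_done; /* completed full pre-copy passes */
    bool postcopy_active;         /* post-copy has already been started */
};

/* Returns true when the library itself should switch to post-copy.
 * With only ENABLE_POSTCOPY the answer is always false: the application
 * owns the policy and triggers the switch from its own thread. */
static bool sketch_should_start_postcopy(unsigned flags,
                                         const struct sketch_progress *p)
{
    assert(p != NULL);

    if (p->postcopy_active)
        return false; /* nothing left to trigger */
    if (!(flags & SKETCH_MIGRATE_POSTCOPY_AFTER_PRECOPY))
        return false; /* policy is left to the caller */

    /* Built-in policy: wait for one complete pre-copy pass so that
     * read-only and kernel pages are already on the destination. */
    return p->precopy_passes_done >= 1;
}
```

In this sketch the check would run each time a pre-copy pass completes. An
application using only the enable flag would implement its own variant of
the same check and call virDomainMigrateStartPostCopy(...) itself.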