On Wed, May 18, 2022 at 14:53:50 +0200, Jiri Denemark wrote: > On Thu, May 12, 2022 at 13:50:08 +0200, Peter Krempa wrote: > > On Tue, May 10, 2022 at 17:21:11 +0200, Jiri Denemark wrote: > > > This phase marks a migration protocol as broken in a post-copy phase. > > > Libvirt is no longer actively watching the migration in this phase as > > > the migration API that started the migration failed. > > > > > > This may either happen when post-copy migration really fails (QEMU > > > enters postcopy-paused migration state) or when the migration still > > > progresses between both QEMU processes, but libvirt lost control of it > > > because the connection between libvirt daemons (in p2p migration) or a > > > daemon and client (non-p2p migration) was closed. For example, when one > > > of the daemons was restarted. > > > > > > Signed-off-by: Jiri Denemark <jdenemar@xxxxxxxxxx> > > > --- > > > src/qemu/qemu_migration.c | 15 +++++++++++---- > > > src/qemu/qemu_process.c | 16 +++++++++++++--- > > > 2 files changed, 24 insertions(+), 7 deletions(-) > > > > > > > [...] > > > > > > > @@ -6327,9 +6334,9 @@ qemuMigrationProcessUnattended(virQEMUDriver *driver, > > > vm->def->name); > > > > > > if (job == VIR_ASYNC_JOB_MIGRATION_IN) > > > - phase = QEMU_MIGRATION_PHASE_FINISH3; > > > + phase = QEMU_MIGRATION_PHASE_FINISH_RESUME; > > > else > > > - phase = QEMU_MIGRATION_PHASE_CONFIRM3; > > > + phase = QEMU_MIGRATION_PHASE_CONFIRM_RESUME; > > > > > > if (qemuMigrationJobStartPhase(vm, phase) < 0) > > > return; > > > > This hunk seems to be misplaced or at least doesn't really seem to be > > related to anything this patch is claiming to do. > > Thanks to changes in this patch migration is in > QEMU_MIGRATION_PHASE_POSTCOPY_FAILED phase when we get here so to make > it all work, we need to use RESUME phases > > QEMU_MIGRATION_PHASE_POSTCOPY_FAILED. Otherwise qemuMigrationCheckPhase > would complain. Perhaps this could be moved to "Add new migration phases > for post-copy recovery", but I haven't checked for sure. > > > > > [..] > > > > > @@ -3751,9 +3752,18 @@ qemuProcessRecoverMigration(virQEMUDriver *driver, > > > return -1; > > > > > > if (rc > 0) { > > > + job->phase = QEMU_MIGRATION_PHASE_POSTCOPY_FAILED; > > > + > > > if (migStatus == VIR_DOMAIN_JOB_STATUS_POSTCOPY) { > > > VIR_DEBUG("Post-copy migration of domain %s still running, it " > > > "will be handled as unattended", vm->def->name); > > > + > > > + if (state == VIR_DOMAIN_RUNNING) > > > + reason = VIR_DOMAIN_RUNNING_POSTCOPY; > > > + else > > > + reason = VIR_DOMAIN_PAUSED_POSTCOPY; > > > > This bit also doesn't seem to be justified by what this patch is > > supposed to do. > > Until now broken migration protocol in post-copy phase has been > indicated with VIR_DOMAIN_*_POSTCOPY_FAILED states, which changed in > this patch and we can use the right state here depending on the QEMU > state. That said, it could be moved into a separate patch. Fair enough on both points. Reviewed-by: Peter Krempa <pkrempa@xxxxxxxxxx>