On Fri, Oct 11, 2019 at 23:18:29 +0000, Jim Fehlig wrote:
> I've been investigating a lockd lock ordering bug in a migration error handling
> path in the libxl driver. In the perform phase, the src calls
> virDomainLockProcessPause to release the lock before sending the VM to dst. In
> this case the send fails for other reasons and an attempt is made to reacquire
> the lock with virDomainLockProcessResume. But that fails since the dst has not
> finished cleaning up the failed VM and releasing the lock it acquired when
> starting to receive the VM. My immediate reaction was "why not reacquire the
> lock in the confirm phase", but then I saw my older comment a few lines later in
> the perform phase code
>
>     /*
>      * Confirm phase will not be executed if perform fails. End the
>      * job started in begin phase.
>      */
>
> Is that just a bug in the implementation, or is it intended to skip the confirm
> phase if perform fails?

It's intended. The Perform phase runs on the source host, so why should we call
Confirm to let the source know about the failure? But of course, the source
has to clean up after the failed migration, similarly to what Confirm would do.

Jirka

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
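
To make the control flow described in the reply concrete, here is a minimal sketch in C of how the source side might sequence the phases. Everything in it is hypothetical: struct src_state, migrate_source and the phase helpers are illustrative names, not libvirt or libxl driver APIs, and the lock is reduced to a boolean flag. It also assumes that reacquiring the lock in the failure path succeeds, which is exactly what does not hold in the bug described above while the dst is still cleaning up.

```c
#include <stdbool.h>

/* Hypothetical model of source-side state; not a libvirt structure. */
struct src_state {
    bool lock_held;    /* stands in for the lockd lock on the VM */
    bool job_active;   /* migration job started in the Begin phase */
    bool confirm_ran;  /* whether the Confirm phase was executed */
};

/* Begin: start the migration job on the source. */
static void begin_phase(struct src_state *s)
{
    s->job_active = true;
}

/*
 * Perform: release the lock (analogous to virDomainLockProcessPause),
 * then try to send the VM. send_ok simulates the result of the send.
 */
static bool perform_phase(struct src_state *s, bool send_ok)
{
    s->lock_held = false;
    return send_ok;
}

/*
 * Cleanup done by the source itself when Perform fails: reacquire the
 * lock (analogous to virDomainLockProcessResume, assumed to succeed
 * here) and end the job started in the Begin phase.
 */
static void perform_cleanup(struct src_state *s)
{
    s->lock_held = true;
    s->job_active = false;
}

/* Confirm: only reached when Perform succeeded. */
static void confirm_phase(struct src_state *s, bool dst_ok)
{
    s->confirm_ran = true;
    if (!dst_ok)
        s->lock_held = true;  /* dst failed: the VM stays on the source */
    s->job_active = false;
}

/* Drive the source-side phases and return the final state. */
struct src_state migrate_source(bool send_ok, bool dst_ok)
{
    struct src_state s = { .lock_held = true };

    begin_phase(&s);
    if (!perform_phase(&s, send_ok)) {
        /* Confirm is skipped on purpose; Perform cleans up after itself. */
        perform_cleanup(&s);
        return s;
    }
    confirm_phase(&s, dst_ok);
    return s;
}
```

Calling migrate_source(false, false) models the failed send: the source ends up holding the lock again with no active job, and confirm_ran stays false because the failure was already known on the source.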