On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote: > > > On 9/9/19 5:52 PM, Eric Blake wrote: >> Commit f10562799 introduced a regression: if reverting to a snapshot >> fails early (such as when we refuse to revert to an external >> snapshot), we lose track of the domain's current snapshot. >> >> See: https://bugzilla.redhat.com/1738747 >> Signed-off-by: Eric Blake <eblake@xxxxxxxxxx> >> --- > > I don't understand why f10562799 broke this code like this - tried > looking the changes made in the commit and the "if (snap)" was > there since a long time, no changes were made in the 'snap' > variable as well - but this change didn't break anything else, so: > > Reviewed-by: Daniel Henrique Barboza <danielhb413@xxxxxxxxx> I'll update the commit message to give more details: Before that patch, we maintained the notion of the current snapshot in two places: vm->current_snapshot, and snap->def->current. If you have a domain with two snapshots, s1 and s2 (where s2 is current) and want to revert to s1, the old code cleared the def->current flag on s2 before noticing that reverting to s1 was not possible, but at least left vm->current_snapshot unchanged. That meant we had a _different_ bug - the code was inconsistent on whether the domain had a current snapshot (as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2 was still claimed as current; but if you restart libvirtd, the XML for s2 didn't track that it was current, so after restart you'd have no current snapshot). After that patch, the code was unconditionally changing vm->current_snapshot to NULL (but at least was consistent everywhere else that the XML never got out of sync with the notion of which snapshot was current). It also didn't help that after that patch, the code clearing the snapshot occurs twice in the function - once right after determining that the early checks have succeeded, the other unconditionally on all failure paths. So the fix is as simple as removing the unconditional clearing of s2 as the current snapshot in the cleanup code, in favor of the earlier clearing that happens only after the early checks succeed. > > > ps: I think adding a test case for this failure (in tests/virsh-snapshot ?) > is worth considering, especially considering that we changes from > from Maxiwell (adding an inactive XML to the snapshot) that can > add up more complexity in the snapshot mechanics. I tried. But the test driver doesn't forbid 'snapshot-revert' on external snapshots, and also doesn't try to write state to XML files, so it never copied the problematic code from the qemu driver that could even trigger the bug. > >> src/qemu/qemu_driver.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >> index b28a26c3d6..093b15f500 100644 >> --- a/src/qemu/qemu_driver.c >> +++ b/src/qemu/qemu_driver.c >> @@ -16941,8 +16941,6 @@ >> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >> virDomainSnapshotSetCurrent(vm->snapshots, NULL); >> ret = -1; >> } >> - } else if (snap) { >> - virDomainSnapshotSetCurrent(vm->snapshots, NULL); >> } >> if (ret == 0 && config && vm->persistent && >> !(ret = virDomainSaveConfig(cfg->configDir, driver->caps, > > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list