On 12/21/18 3:23 AM, Nikolay Shirokovskiy wrote: > > > On 20.12.2018 23:31, John Ferlan wrote: >> >> >> On 12/13/18 3:03 AM, Nikolay Shirokovskiy wrote: >>> In case of active persistent domain snapshot metadata is >>> not complete. We save only active configuration and on >>> restore use it both for active and inactive configuration. >>> Let's fix it and save and restore both in this case. >>> >>> In case of active transient or inactive domain we have >>> only one config and thus everything is good. >>> >>> By the way this patch fixes config memleak on error paths. >> >> use @config to make it clearer... >> >> I see how/why you combined things and although extracting out the >> memleak is a bit difficult, I think it should be done. >> >> Perhaps in the cleanup code to *SaveConfig, you just set @config = NULL >> "temporarily" since we know/assume virDomainObjAssignDef was successful >> and we wouldn't want the subsequently added virDomainDefFree(config) to >> free it on those paths, but we would want it to be free'd on error >> paths. Then by using @newConfig in this patch, it's even more obvious >> and of course the temporary config = NULL would be removed... Hope this >> makes sense. > > Ok. I thought of a distinct patch too, but having in mind that adding/removing > code in subsequent patches is not preferred I decided to just note this moment > in the message. > I'm mostly 50/50 on this - leaning towards separation mainly because I think it is possible, but also because when you mention adding a new feature and fixing an existing memory leak in the commit message that generally results in someone asking for separation. Certainly makes it easier to bisect to a commit and not have to wonder did the new functionality or the bugfix cause some issue... >> >>> >>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@xxxxxxxxxxxxx> >>> --- >>> src/conf/snapshot_conf.c | 1 + >>> src/conf/snapshot_conf.h | 2 ++ >>> src/qemu/qemu_driver.c | 44 +++++++++++++++++++++++++++++++++++++++----- >>> 3 files changed, 42 insertions(+), 5 deletions(-) >>> >> >> Not my area of expertise, but rather than have it sit unreviewed... >> >> >>> diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c >>> index 5a511b4..2e4a0b9 100644 >>> --- a/src/conf/snapshot_conf.c >>> +++ b/src/conf/snapshot_conf.c >>> @@ -102,6 +102,7 @@ void virDomainSnapshotDefFree(virDomainSnapshotDefPtr def) >>> virDomainSnapshotDiskDefClear(&def->disks[i]); >>> VIR_FREE(def->disks); >>> virDomainDefFree(def->dom); >>> + virDomainDefFree(def->persistDom); >>> virObjectUnref(def->cookie); >>> VIR_FREE(def); >>> } >>> diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h >>> index 20a42bd..3da204a 100644 >>> --- a/src/conf/snapshot_conf.h >>> +++ b/src/conf/snapshot_conf.h >>> @@ -75,6 +75,8 @@ struct _virDomainSnapshotDef { >>> virDomainSnapshotDiskDef *disks; >>> >>> virDomainDefPtr dom; >>> + /* inactive domain config in case of active persistent domain */ >>> + virDomainDefPtr persistDom; >>> >>> virObjectPtr cookie; >>> >>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >>> index f7c1d6f..7e0585d 100644 >>> --- a/src/qemu/qemu_driver.c >>> +++ b/src/qemu/qemu_driver.c >>> @@ -15687,6 +15687,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, >>> VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATE))) >>> goto endjob; >>> >>> + if (virDomainObjIsActive(vm) && vm->persistent && >> >> As it's easy to get lost or forget... IIRC vm->persistent means that w> have used the Define functions to provide a "config" at some point in > >> time. Then eventually when/if we modify the "active" object to make a >> configuration specific change, then the @newDef gets populated with the >> config change(s); otherwise, the changes are made to the "active" def >> and eventually/possibly saved. All the magic is hidden away in >> accessors that I don't think about all that often any more. > > If by otherwise you mean "if we make changes to active config" - then > such changes are saved to status files but they can't be written > directly to config file. > Like I said - some of the magic and details are lost in accessors. Yes, the active is essentially saved to status... The whole def/newDef fetch, store, and manipulation gets hidden behind those *Domain.*Persistent* type helpers. The @newDef nomenclature has always struck me as a bit odd, but using a longer more descriptive name may not help me either. >> >> So, should the vm->persistent be vm->newDef instead? Either way if the >> following call is made and vm->newDef == NULL, then eventually in >> virDomainDefFormatInternal we would core when @def was NULL. > > Well there is at least one case when vm->newDef != NULL but domain is not persistent > - if we undefine domain vm->newDef is not released. This can be fixed but > I'm not sure there are no other cases. So checking vm->persisent is more reliable. So we have an active persistent domain with a newDef that eventually is made non persistent... I'm sure there's a reason for it... > > I think this is currenly invariant that if virDomainObjIsActive(vm) && vm->persistent > then vm->newDef != NULL but we can add explicit check. > > The best option I think is to fix all cases when vm->newDef != NULL for non persistent > active domain and replace check as you suggested. What do you think? > I think my main concern would be vm->persistent == true and vm->newDef == NULL. > >> >> >>> + !(def->persistDom = qemuDomainDefCopy(driver, vm->newDef, >>> + VIR_DOMAIN_XML_SECURE | >>> + VIR_DOMAIN_XML_MIGRATABLE))) >> >> Do we need VIR_DOMAIN_XML_INACTIVE too since this is the "next" >> persistent config def? A variant for parse is used later in patch5. > > This is one of the hardest questions - what combinations of flags one should > use for one's purpuses :) I'm not kidding, there are no good definitions of 110% agree. > VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_DEF_FORMAT_INACTIVE. So I use same > combination migration uses to pass persistent config to destination - > check qemuMigrationSrcRun. and obviously I asked because I find the design of the flags is confusing and what each means with respect to what post parse processing or validation occurs - I just cannot keep it in memory. > > Hmm, virDomainDefFormatInternal has next hunk at the beginning: > > if (def->id == -1) > flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE; > > and def->id == -1 for inactive config (@id is set to -1 on parsing > inactive configs and on domain start @def is copied to @newDef before > allocating @id - check qemuProcessInit). So we have this flag anyway :) <sigh>... commit f24e67d24 > > Also even the above fact does not have the case AFAIU having this flag for inactive config > does not have much meaning - random hunks of code that I checked filter > dumping info that gained by config on domain start. However I run across a > kind of exception too - @id is not dumped at all if VIR_DOMAIN_DEF_FORMAT_INACTIVE > is set however this is not make much difference. > The case you're chasing though would have an id != -1 since it's an active guest, but we're saving the newDef/next config. Perhaps similar to how various qemu_driver commands can manipulate both active and config (attach, detach, change, etc). It's a "hypervisor dependent" action as to what happens. >> >>> + goto endjob; >>> + >> >> Since we are doing this as "best chance" and aren't requiring it, what >> are your (and perhaps others) thoughts related to making this really >> optional. As in if the qemuDomainDefCopy fails, then we just throw a >> VIR_INFO, clear the last error, and continue on? >> >> I don't have a preference, but I'm just throwing it out there as an idea. > > I should say I'm not looking at this step as "best chance" - if we can > not revert exact state I'd prefer to fail here. > >> >>> if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { >>> align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL; >>> align_match = false; >>> @@ -16219,6 +16225,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> qemuDomainObjPrivatePtr priv; >>> int rc; >>> virDomainDefPtr config = NULL; >>> + virDomainDefPtr persistConfig = NULL; >>> virQEMUDriverConfigPtr cfg = NULL; >>> virCapsPtr caps = NULL; >>> bool was_stopped = false; >>> @@ -16226,6 +16233,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> virCPUDefPtr origCPU = NULL; >>> unsigned int start_flags = VIR_QEMU_PROCESS_START_GEN_VMID; >>> qemuDomainAsyncJob jobType = QEMU_ASYNC_JOB_START; >>> + virDomainDefPtr newConfig = NULL; >>> >>> virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | >>> VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | >>> @@ -16332,6 +16340,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> goto endjob; >>> } >>> >>> + if (snap->def->persistDom && >>> + !(persistConfig = virDomainDefCopy(snap->def->persistDom, caps, >>> + driver->xmlopt, NULL, true))) >>> + goto endjob; >>> + >> >> Based only on my previous comment, for this part I agree especially >> since if we were able to save a config, then we'd want to be sure to >> restore it. >> >>> cookie = (qemuDomainSaveCookiePtr) snap->def->cookie; >>> >>> switch ((virDomainState) snap->def->state) { >>> @@ -16440,8 +16453,18 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> * failed loadvm attempt? */ >>> goto endjob; >>> } >>> - if (config) { >>> - virDomainObjAssignDef(vm, config, false, NULL); >>> + >>> + /* Older versions do not save inactive config in metadata, instead >>> + * they use active config for this purpose, so keep this behaviour >>> + * for backward compat. >>> + */ >>> + if (persistConfig) >>> + VIR_STEAL_PTR(newConfig, persistConfig); >>> + else >>> + VIR_STEAL_PTR(newConfig, config); >>> + >>> + if (newConfig) { >>> + virDomainObjAssignDef(vm, newConfig, false, NULL); >>> virCPUDefFree(priv->origCPU); >>> VIR_STEAL_PTR(priv->origCPU, origCPU); >>> } >>> @@ -16449,8 +16472,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> /* Transitions 2, 3 */ >>> load: >>> was_stopped = true; >>> - if (config) >>> + if (config) { >>> virDomainObjAssignDef(vm, config, false, NULL); >>> + config = NULL; >> >> This "makes sense" except for the case where @persistConfig == NULL, >> then we wouldn't virDomainSaveConfig below as we currently do. So maybe >> this should be: >> >> if (!persistConfig) >> VIR_STEAL_PTR(newConfig, config); >> else >> config = NULL; >> >> That way we won't virDomainDefFree @config below (which is what you're >> trying to avoid), but we will save @config as we did before unless of >> course @persistConfig is going to be used. >> >> Make sense? > > Good catch! I think I'll change to: > > if (config) { > virDomainObjAssignDef(vm, config, false, NULL); > VIR_STEAL_PTR(newConfig, config); > } > > because if we have persistConfig but fail to assing it - means > we are on error path and we won't save newConfig anyway. > OK - I was going explicit with the alternative to the code below for persistConfig/newConfig. >> >>> + } >>> >>> /* No cookie means libvirt which saved the domain was too old to >>> * mess up the CPU definitions. >>> @@ -16471,6 +16496,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> detail); >>> if (rc < 0) >>> goto endjob; >>> + >>> + if (persistConfig) { >> >> Perhaps we should point out: >> >> /* This will save the @persistConfig in the vm->newDef where it >> * was originally copied from. */ >> >> John >> >>> + virDomainObjAssignDef(vm, persistConfig, false, NULL); >>> + VIR_STEAL_PTR(newConfig, persistConfig); >>> + } >>> } > > > This makes me wonder if virDomainObjAssignDef is too overcomplicated > if we need such comments. > Nah, no way that's over complicated <eyeroll>. Touching it though means you get to own making it better... job security and all that. John > Nikolay > >>> >>> /* Touch up domain state. */ >>> @@ -16535,8 +16565,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> qemuDomainRemoveInactive(driver, vm); >>> goto endjob; >>> } >>> - if (config) >>> + if (config) { >>> virDomainObjAssignDef(vm, config, false, NULL); >>> + VIR_STEAL_PTR(newConfig, config); >>> + } >>> >>> if (flags & (VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | >>> VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { >>> @@ -16600,7 +16632,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> } else if (snap) { >>> snap->def->current = false; >>> } >>> - if (ret == 0 && config && vm->persistent && >>> + if (ret == 0 && newConfig && vm->persistent && >>> !(ret = virDomainSaveConfig(cfg->configDir, driver->caps, >>> vm->newDef ? vm->newDef : vm->def))) { >>> detail = VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT; >>> @@ -16616,6 +16648,8 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> virObjectUnref(cfg); >>> virNWFilterUnlockFilterUpdates(); >>> virCPUDefFree(origCPU); >>> + virDomainDefFree(config); >>> + virDomainDefFree(persistConfig); >>> >>> return ret; >>> } >>> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list