On Mon, Jun 15, 2015 at 08:36:47PM -0600, Jim Fehlig wrote:
Similar to commit 540c339a for the QEMU driver, rework reference counting in the libxl driver to make it more deterministic and the code a bit cleaner. Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx> --- I've been testing this patch on and off for a few weeks now using libvirt-tck domain tests, local test scripts, and some manual tests for good measure. I sent the patch to Anthony Perard (cc'd) nearly two weeks ago for testing in his OpenStack+Xen+libvirt CI loop, although I haven't received any feedback thus far. Also included Martin in the cc since he encouraged this patch
I'm sorry I didn't get to it earlier. I'm not familiar with the libxl driver, but I'll at least give it a try since this is on the list for some time.
https://www.redhat.com/archives/libvir-list/2015-April/msg00014.html src/libxl/libxl_domain.c | 32 ++---- src/libxl/libxl_domain.h | 5 +- src/libxl/libxl_driver.c | 274 ++++++++++++++++++-------------------------- src/libxl/libxl_migration.c | 6 +- 4 files changed, 128 insertions(+), 189 deletions(-) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 0652270..ce188ee 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -294,6 +294,25 @@ libxlDomObjFromDomain(virDomainPtr dom) return vm; } +/* + * Finish working with a domain object in an API. This function + * clears whatever was left of a domain that was gathered using + * libxlDomObjFromDomain(). Currently that means only unlocking and + * decrementing the reference counter of that domain. And in order to + * make sure the caller does not access the domain, the pointer is + * cleared. + */ +static void +libxlDomObjEndAPI(virDomainObjPtr *vm) +{
This function already exists as virDomainObjEndAPI().
+ if (!*vm) + return; + + virObjectUnlock(*vm); + virObjectUnref(*vm); + *vm = NULL; +} + static int libxlAutostartDomain(virDomainObjPtr vm, void *opaque) @@ -2760,8 +2729,7 @@ libxlDomainUndefineFlags(virDomainPtr dom, cleanup: VIR_FREE(name); - if (vm) - virObjectUnlock(vm); + libxlDomObjEndAPI(&vm);
In this function the domain can leak because the reference you get by libxlDomObjFromDomain() is not returned anywhere. You might get to this cleanup phase with a domain that has a reference count, but vm is set to NULL. virDomainObjListRemove() leaves the domain unlocked, Unless I missed something this will leak. Stream of Random thoughts follows. Not setting 'vm = NULL' will help a bit, but in that case the domain will be unlocked twice. At first I thought maybe something similar to how QEMU driver does that in qemuDomainRemoveInactive() might work. But after looking at it for a while, I see this bug is in QEMU, the domain will be unlocked twice there. I need to send a fix for that because now we are resetting a job when the domain is unlocked. That leads me to an idea. Could we somehow make most of the things that are so similar abstracted somehow? That's not related to this patch though. But for example that would fix stuff like the fact that libxl doesn't hold a job when removing the domain.
@@ -4963,23 +4915,20 @@ libxlDomainMigrateFinish3Params(virConnectPtr dconn, NULLSTR(dname)); return NULL; } + virObjectRef(vm); - if (virDomainMigrateFinish3ParamsEnsureACL(dconn, vm->def) < 0) { - virDomainObjEndAPI(&vm); - return NULL; - } + if (virDomainMigrateFinish3ParamsEnsureACL(dconn, vm->def) < 0) + goto cleanup; - if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0) { - virDomainObjEndAPI(&vm); - return NULL; - } + if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0) + goto cleanup; ret = libxlDomainMigrationFinish(dconn, vm, flags, cancelled); - if (!libxlDomainObjEndJob(driver, vm)) - vm = NULL; + libxlDomainObjEndJob(driver, vm); - virDomainObjEndAPI(&vm); + cleanup: + libxlDomObjEndAPI(&vm);
Wow, and it was called here, how funny :) Although that might mean something is wrong since you're not adding unref, but added ref on top of this hunk. Martin P.S.: Now 'make check' fails for me due to false-positives about mismatched ACL functions from migrate APIs, but that might be due to poor rebase on top of master which I did.
Attachment:
signature.asc
Description: PGP signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list