2011/7/27 Matthias Bolte <matthias.bolte@xxxxxxxxxxxxxx>: > doRemoteClose doesn't free the virNetClientPtr and this creates a > 260kb leak per remote connection. This happens because > virNetClientFree doesn't remove the last ref, because virNetClientNew > creates the virNetClientPtr with a refcount of 2. > > static virNetClientPtr virNetClientNew(virNetSocketPtr sock, > const char *hostname) > { > [...] > client->refs = 1; > [...] > /* Set up a callback to listen on the socket data */ > client->refs++; > if (virNetSocketAddIOCallback(client->sock, > VIR_EVENT_HANDLE_READABLE, > virNetClientIncomingEvent, > client, > virNetClientEventFree) < 0) { > client->refs--; > VIR_DEBUG("Failed to add event watch, disabling events"); > } > [...] > } > > virNetClientNew adds a ref before calling virNetSocketAddIOCallback > but only removes it when virNetSocketAddIOCallback fails. This seems > wrong too me and the ref should be removed in the success case too. > > The same pattern was added in 0302391ee643ad91fdc6d2ecf7e4cf0fc9724516 > (Fix race in ref counting when handling RPC jobs) > > --- a/src/rpc/virnetserverclient.c > +++ b/src/rpc/virnetserverclient.c > @@ -763,10 +763,12 @@ readmore: > > /* Send off to for normal dispatch to workers */ > if (msg) { > + client->refs++; > if (!client->dispatchFunc || > client->dispatchFunc(client, msg, > client->dispatchOpaque) < 0) { > virNetMessageFree(msg); > client->wantClose = true; > + client->refs--; > return; > } > } > > Again, this seems wrong and the ref should be removed in the success > case here too. Before I spent time to figure out how the refcounting > is supposed to work here I just report it and hope that Dan can easily > fix this. Okay, after some discussion on IRC with Michal and Eric and further testing I think that the ref counting is okay here. virNetClientNew adds the second ref because the free callback (virNetClientEventFree) passed to virNetSocketAddIOCallback will call virNetClientFree. Another observation is that virNetClientFree calls virNetSocketRemoveIOCallback when the last ref was removed. Actually that's pointless because virNetSocketRemoveIOCallback might indirectly removed the last ref. So this looks like an ordering problem. The ordering gets fixed by calling virNetClientClose before virNetClientFree, because virNetClientClose calls virNetSocketRemoveIOCallback. Another problem is that virNetSocketRemoveIOCallback only indirectly results in a call to virNetClientFree, because the event loop will call virNetClientFree when removing callbacks marked for deletion -- virNetSocketRemoveIOCallback does this marking. The memory leak I saw was due to virsh calling virEventRegisterDefaultImpl, but not calling virEventRunDefaultImpl. Because the event loop is initialized, the call to virNetSocketAddIOCallback in virNetClientNew succeeds. But virsh not driving the event loop results in not removing callbacks that were marked for deletion. Finally this leaks the virNetClientPtr using in the remote driver. I used a virsh -c qemu:///system to test. I was able fix this by calling virEventRunDefaultImpl after virConnectClose in virsh. But I don't think that this is the correct general solution. To me the general problem seems to be the entanglement between the virNetClientPtr refcounting and the event loop. I'm not sure how to improve this in general. Maybe should have a thread calling virEventRunDefaultImpl as the comment on virEventRunDefaultImpl suggest. -- Matthias Bolte http://photron.blogspot.com -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list