On Thu, Oct 26, 2017 at 09:14:02AM -0400, John Ferlan wrote: > > > On 10/25/2017 05:48 AM, John Ferlan wrote: > > > > > > On 10/25/2017 05:15 AM, Nikolay Shirokovskiy wrote: > >> > >> > >> On 25.10.2017 12:06, John Ferlan wrote: > >>> > >>> > >>> On 10/24/2017 06:47 AM, Nikolay Shirokovskiy wrote: > >>>> After virNetDaemonAddServerPostExec call in virtlogd we should have > >>>> netserver refcount set to 2. One goes to netdaemon servers hashtable > >>>> and one goes to virtlogd own reference to netserver. Let's add > >>>> missing increment in virNetDaemonAddServerPostExec itself while holding > >>>> daemon lock. > >>>> > >>>> We also have to unref new extra ref after virtlockd call to virNetDaemonAddServerPostExec. > >>>> --- > >>>> src/locking/lock_daemon.c | 1 + > >>>> src/rpc/virnetdaemon.c | 1 + > >>>> 2 files changed, 2 insertions(+) > >>>> > >>>> diff --git a/src/locking/lock_daemon.c b/src/locking/lock_daemon.c > >>>> index fe3eaf9..41a06b2 100644 > >>>> --- a/src/locking/lock_daemon.c > >>>> +++ b/src/locking/lock_daemon.c > >>>> @@ -275,6 +275,7 @@ virLockDaemonNewPostExecRestart(virJSONValuePtr object, bool privileged) > >>>> virLockDaemonClientFree, > >>>> (void*)(intptr_t)(privileged ? 0x1 : 0x0)))) > >>>> goto error; > >>>> + virObjectUnref(srv); > >>> > >>> I need to think a bit more about this one... different model in lockd > >>> vs. logd over @srv usage especially w/r/t this particular path. > >>> > >>> I see that in this path eventually something calls virNetDaemonGetServer > >>> instead of storing the @srv in order to add lockProgram... In any case,> I guess I'd be worried/concerned that something could remove @srv > >>> causing the Hash table code to unref/delete the srv... Also, in the > >>> cleanup: path of main() wouldn't the Unref(srv) cause problems? > >> > >> But this unref only balance newly added ref. If there are other > >> problems with reference conting in virtlockd - its a different > >> concern. > >> > > > > Ok - so what I've learned is that virLockDaemonPostExecRestart returns > > rv == 1 which indicates a successful restore resulting in a call to > > virNetDaemonGetServer which will do a virObjectRef anyway leaving us > > with (theoretically) 2 refs prior to the code that adds programs to @srv > > (and less concern from my part of the subsequent Unref in cleanup: here). > > > > If we saved srv in lockd like logd does, then we wouldn't need to call > > virNetDaemonGetServer resulting in the same eventual result except for a > > window after the Unref here where the refcnt == 1 until > > virNetDaemonGetServer is run. Since I believe nothing could happen in > > between that would cause the Unref due to HashFree being called, then I > > think we're OK. > > > > The concern being is if virNetDaemonGetServer didn't find the server, > > then @srv would be NULL and the subsequent call to > > virNetServerAddProgram for lockProgram would fail miserably, but I don't > > think that can happen theoretically at least! > > > > John > > > > > >>> > >>> Again- need to think a bit longer. > > I've been thinking more about this more... > > Looking through history - I see commit id f08b1c58 removed @srv from > _virLockDaemon, but _virLogDaemon wasn't adjusted likewise. > > I also see commit id 252610f7 which modified the virnetdaemon code to > use hash tables. What I'm missing is why that patch didn't add the > virObjectRef as well since the @srv is placed into the hash table to > match the similar operation done for the virNetDaemonAddServer. > > It seems when first written perhaps 252610f7 came first with f08b1c58 > afterwards, but when committed it was in reverse order. > > There's just something not quite right and I'm not quite sure what the > exact right patch(es) should be. > > "Theoretically speaking"... > > If we get a REF for alloc, then we need an UNREF > If we get a REF for hash, then we need an UNREF > If we get a REF for virNetDaemonGetServer, then we need an UNREF > > Without the REF in PostExec, then LogD will have a problem if there's a > usage of logd->srv after the UNREF for the hash table happens. In > particular, the UNREF in virLogDaemonFree. However, for LockD there's > not a problem because there's REF's for GetServer and no UNREF on the > allocation. For the "non"-PostExec path - there's a leak of @srv > (theoretical) and what seems to be normal processing for the PostExec path. > > With the REF in PostExec, LogD would be fixed; however, LockD then has a > leak because of the extra REF. > > > Pushing patch1 seems to solve at least the problem w/ the late/delayed > client access (as shown by the sleep() comments), but I fear it opens > the door for a LogD problem because hash table ref gets decremented sooner. > > It's all very tricky and timing based. > > Perhaps someone else wants to have a look too and provide some thoughts? I'll definitely have a look at the series, but being at KVM Forum, my workflow is broken so to speak, hopefully I'll manage to get through this thoroughly before you push this. Erik -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list