Harsha,

I haven't gotten around to looking at the valgrind output yet. I am not sure I
will be able to do it soon since I am travelling next week. Are you seeing an
equal number of disconnect messages in the glusterd logs? What is the ip:port
you observe in the RPC_CLNT_CONNECT messages? Could you attach the logs to the
bug?

(I've appended a rough sketch of the statedump steps and of the --disable-epoll
build discussed further down at the end of this mail, for reference.)

thanks,
Krish

----- Original Message -----
> This is a small-memory system, around 1024M, and the disk space for the
> volume is 9 GB. I do not think it has anything to do with AFR per se -
> the same bug is also reproducible on the bricks and the NFS server. It
> might also be that we aren't able to capture glusterdumps properly on
> non-Linux platforms - that is one of the reasons I used the valgrind output.
>
> Valgrind reports 'lost memory' blocks - you can also see the screenshots,
> which show memory ramping up within seconds with no I/O, in fact with no
> data on the volume at all.
>
> The workaround I have found to contain this issue is to disable the
> self-heal daemon and NFS - after that, memory usage stays normal. One
> interesting observation after running the Gluster management daemon in
> debug mode: I can see that the RPC_CLNT_CONNECT event is constantly being
> triggered - shouldn't that occur only once per process notification?
>
>
> On Thu, Jul 17, 2014 at 3:38 AM, Krishnan Parthasarathi
> <kparthas@xxxxxxxxxx> wrote:
> > Harsha,
> >
> > I don't actively work on AFR, so I might have missed some things.
> > I looked for the following things in the statedump for any
> > memory-allocation related oddities.
> >
> > 1) grep "pool-misses" *dump*
> > This tells us whether there were any objects whose allocated mem-pool
> > wasn't sufficient for the load it was working under.
> > I see that the pool-misses were zero, which means we are doing well with
> > the mem-pools we allocated.
> >
> > 2) grep "hot-count" *dump*
> > This tells us the number of objects of each kind that were 'active' in
> > the process when the statedump was taken. This should allow us to see
> > whether the numbers are explicable.
> > The maximum hot-count across the statedumps of these processes is 50,
> > which isn't alarming and doesn't point to any obvious memory leak.
> >
> > The above observations indicate that some object that is not mem-pool
> > allocated is being leaked.
> >
> > Hope this helps,
> > Krish
> >
> > ----- Original Message -----
> >> Here you go KP - https://bugzilla.redhat.com/show_bug.cgi?id=1120570
> >>
> >> On Thu, Jul 17, 2014 at 12:37 AM, Krishnan Parthasarathi
> >> <kparthas@xxxxxxxxxx> wrote:
> >> > Harsha,
> >> >
> >> > In addition to the valgrind output, the statedump output of the
> >> > glustershd process when the leak is observed would be really helpful.
> >> >
> >> > thanks,
> >> > Krish
> >> >
> >> > ----- Original Message -----
> >> >> Nope, spoke too early: using poll() has no effect on the memory usage
> >> >> on Linux, so it's actually back to FreeBSD.
> >> >>
> >> >> On Thu, Jul 17, 2014 at 12:07 AM, Harshavardhana
> >> >> <harsha@xxxxxxxxxxxxxxxxxx> wrote:
> >> >> > KP,
> >> >> >
> >> >> > I do have 3.2 GB worth of valgrind output which indicates this
> >> >> > issue; I am trying to reproduce it on Linux.
> >> >> >
> >> >> > My hunch is that compiling with --disable-epoll might actually
> >> >> > trigger this issue on Linux too. Will update here
> >> >> > once I have done that testing.
> >> >> >
> >> >> >
> >> >> > On Wed, Jul 16, 2014 at 11:44 PM, Krishnan Parthasarathi
> >> >> > <kparthas@xxxxxxxxxx> wrote:
> >> >> >> Emmanuel,
> >> >> >>
> >> >> >> Could you take a statedump* of the glustershd process when it has
> >> >> >> leaked enough memory to be observable, and share the output? This
> >> >> >> might show us what kind of objects we are allocating in abnormally
> >> >> >> high numbers.
> >> >> >>
> >> >> >> * statedump of a glusterfs process:
> >> >> >>   # kill -USR1 <pid of process>
> >> >> >>
> >> >> >> HTH,
> >> >> >> Krish
> >> >> >>
> >> >> >>
> >> >> >> ----- Original Message -----
> >> >> >>> On Wed, Jul 16, 2014 at 11:32:06PM -0700, Harshavardhana wrote:
> >> >> >>> > On a side note, while looking into this issue I uncovered a
> >> >> >>> > memory leak too: after successful registration with glusterd,
> >> >> >>> > the self-heal daemon and the NFS server are killed by the
> >> >> >>> > FreeBSD memory manager. Have you observed any memory leaks?
> >> >> >>> > I have the valgrind output and it clearly indicates large memory
> >> >> >>> > leaks - perhaps it is just a FreeBSD thing!
> >> >> >>>
> >> >> >>> I observed memory leaks with long-term usage. My favourite test
> >> >> >>> case is building NetBSD on a replicated/distributed volume, and I
> >> >> >>> can see processes growing a lot during the build. I reported it
> >> >> >>> some time ago, and some leaks were plugged, but obviously some
> >> >> >>> remain.
> >> >> >>>
> >> >> >>> valgrind was never ported to NetBSD, hence I lack investigative
> >> >> >>> tools, but I bet the leaks exist on FreeBSD and Linux as well.
> >> >> >>>
> >> >> >>> --
> >> >> >>> Emmanuel Dreyfus
> >> >> >>> manu@xxxxxxxxxx
> >> >> >>> _______________________________________________
> >> >> >>> Gluster-devel mailing list
> >> >> >>> Gluster-devel@xxxxxxxxxxx
> >> >> >>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> >> >> >>>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Religious confuse piety with mere ritual, the virtuous confuse
> >> >> > regulation with outcomes
> >> >>
> >> >>
> >> >> --
> >> >> Religious confuse piety with mere ritual, the virtuous confuse
> >> >> regulation with outcomes
> >> >>
> >>
> >>
> >> --
> >> Religious confuse piety with mere ritual, the virtuous confuse
> >> regulation with outcomes
> >>
>
> --
> Religious confuse piety with mere ritual, the virtuous confuse
> regulation with outcomes
> _______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
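Appendix: a minimal sketch of the statedump checks described above. The
glustershd pid lookup and the /var/run/gluster dump directory are assumptions
about a typical Linux package install - adjust both for your build, since the
statedump directory differs on source builds and on the BSDs:

    # find the self-heal daemon's pid (its command line contains "glustershd")
    pid=$(pgrep -f glustershd | head -n 1)

    # ask the process to write a statedump, as per the instructions above
    kill -USR1 "$pid"

    # give it a moment to finish writing, then look at the dump files
    sleep 2
    cd /var/run/gluster

    # non-zero pool-misses => some mem-pool was too small for its load
    grep "pool-misses" *dump*

    # hot-count => number of objects of each type live when the dump was taken
    grep "hot-count" *dump*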
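And a sketch of the --disable-epoll build Harsha mentions, in case anyone else
wants to try reproducing this on Linux. It assumes a checkout of the glusterfs
source tree whose configure script accepts --disable-epoll (falling back to
the poll() based event handler); the -j4 is arbitrary:

    # from the top of a glusterfs source checkout
    ./autogen.sh
    ./configure --disable-epoll
    make -j4
    sudo make install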