This is a small-memory system, with 1024 MB of RAM and 9 GB of disk
space for the volume. I do not think this has anything to do with AFR
per se: the same bug is also reproducible on the bricks and on the NFS
server. It may also be that we cannot capture Gluster statedumps
properly on non-Linux platforms, which is one of the reasons I used
Valgrind output. Valgrind reports 'lost memory' blocks, and the
screenshots show memory ramping up within seconds with no I/O, in fact
with no data on the volume at all.

The workaround I have found to contain this issue is to disable the
self-heal daemon and NFS; after that, memory usage stays stable.

An interesting observation: after running the Gluster management daemon
in debug mode, I can see that the RPCLNT_CONNECT event is being
triggered constantly. Shouldn't that occur only once per process
notification?

On Thu, Jul 17, 2014 at 3:38 AM, Krishnan Parthasarathi <kparthas@xxxxxxxxxx> wrote:
> Harsha,
>
> I don't actively work on AFR, so I might have missed some things.
> I looked for the following things in the statedump for any memory allocation
> related oddities.
> 1) grep "pool-misses" *dump*
> This tells us if there were any objects whose allocated mem-pool wasn't sufficient
> for the load it was working under.
> I see that the pool-misses were zero, which means we are doing good with the
> mem-pools we allocated.
>
> 2) grep "hot-count" *dump*
> This tells us the no. of objects of any kind that are 'active' in the process while
> the statedump was taken. This should allow us to see if the numbers we see are explicable.
> I see the maximum hot-count across statedumps of processes is 50, which isn't alarming
> and doesn't point to any obvious memory leaks.
>
> The above observations indicate that some object that is not mem-pool allocated is being leaked.
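
For anyone repeating the two greps above across many dumps, here is a
minimal sketch that collects the same two counters per pool. It assumes
the mem-pool records in a statedump are plain key=value lines introduced
by a "pool-name=" line; the function name and the sample pool names and
numbers are made up for illustration and are not taken from the dumps
attached to the bug.

```python
from typing import Dict, Iterable, List


def summarize_mempools(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Collect pool-name, hot-count and pool-misses for each mem-pool record.

    Assumption: each record starts with a 'pool-name=' line and its
    counters follow as 'key=value' lines until the next 'pool-name='.
    """
    pools: List[Dict[str, object]] = []
    current = None
    for raw in lines:
        key, sep, value = raw.strip().partition("=")
        if not sep:
            continue  # section headers and blank lines carry no '='
        if key == "pool-name":
            current = {"pool-name": value, "hot-count": 0, "pool-misses": 0}
            pools.append(current)
        elif key in ("hot-count", "pool-misses") and current is not None:
            current[key] = int(value)
    return pools


# Hypothetical sample, shaped like the assumed statedump layout.
SAMPLE = """\
[mempool]
-----=-----
pool-name=glustershd:dict_t
hot-count=50
cold-count=4046
pool-misses=0
-----=-----
pool-name=glustershd:data_t
hot-count=12
cold-count=16371
pool-misses=0
"""

pools = summarize_mempools(SAMPLE.splitlines())
max_hot = max(p["hot-count"] for p in pools)
total_misses = sum(p["pool-misses"] for p in pools)
print(f"pools={len(pools)} max hot-count={max_hot} total pool-misses={total_misses}")
```

If both numbers stay low while the resident size keeps growing, that
matches the conclusion above: whatever is leaking is not allocated from
a mem-pool.
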
>
> Hope this helps,
> Krish
>
> ----- Original Message -----
>> Here you go KP - https://bugzilla.redhat.com/show_bug.cgi?id=1120570
>>
>> On Thu, Jul 17, 2014 at 12:37 AM, Krishnan Parthasarathi
>> <kparthas@xxxxxxxxxx> wrote:
>> > Harsha,
>> >
>> > In addition to the valgrind output, statedump output of the glustershd process
>> > when the leak is observed would be really helpful.
>> >
>> > thanks,
>> > Krish
>> >
>> > ----- Original Message -----
>> >> Nope, spoke too early: using poll() has no effect on the memory usage
>> >> on Linux, so actually back to FreeBSD.
>> >>
>> >> On Thu, Jul 17, 2014 at 12:07 AM, Harshavardhana
>> >> <harsha@xxxxxxxxxxxxxxxxxx> wrote:
>> >> > KP,
>> >> >
>> >> > I do have 3.2 GB worth of valgrind output which indicates this
>> >> > issue; trying to reproduce this on Linux.
>> >> >
>> >> > My hunch says that compiling with --disable-epoll might actually
>> >> > trigger this issue on Linux too. Will update here
>> >> > once I have done that testing.
>> >> >
>> >> >
>> >> > On Wed, Jul 16, 2014 at 11:44 PM, Krishnan Parthasarathi
>> >> > <kparthas@xxxxxxxxxx> wrote:
>> >> >> Emmanuel,
>> >> >>
>> >> >> Could you take a statedump* of the glustershd process when it has leaked
>> >> >> enough memory to be able to observe, and share the output? This might
>> >> >> tell us what kind of objects we are allocating in abnormally high numbers.
>> >> >>
>> >> >> * statedump of a glusterfs process
>> >> >> # kill -USR1 <pid of process>
>> >> >>
>> >> >> HTH,
>> >> >> Krish
>> >> >>
>> >> >>
>> >> >> ----- Original Message -----
>> >> >>> On Wed, Jul 16, 2014 at 11:32:06PM -0700, Harshavardhana wrote:
>> >> >>> > On a side note, while looking into this issue I uncovered a memory
>> >> >>> > leak too: after successful registration with glusterd, the
>> >> >>> > self-heal daemon and NFS server are killed by the FreeBSD memory
>> >> >>> > manager. Have you observed any memory leaks?
>> >> >>> > I have the valgrind output and it clearly indicates large memory
>> >> >>> > leaks - perhaps it could be just a FreeBSD thing!
>> >> >>>
>> >> >>> I observed memory leaks on long-term usage. My favourite test case
>> >> >>> is building NetBSD on a replicated/distributed volume, and I can see
>> >> >>> processes growing a lot during the build. I reported it some time ago,
>> >> >>> and some leaks were plugged, but obviously some remain.
>> >> >>>
>> >> >>> valgrind was never ported to NetBSD, hence I lack investigative tools,
>> >> >>> but I bet the leaks exist on FreeBSD and Linux as well.
>> >> >>>
>> >> >>> --
>> >> >>> Emmanuel Dreyfus
>> >> >>> manu@xxxxxxxxxx
>> >> >>> _______________________________________________
>> >> >>> Gluster-devel mailing list
>> >> >>> Gluster-devel@xxxxxxxxxxx
>> >> >>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>> >> >>>
>> >> >
>> >> > --
>> >> > Religious confuse piety with mere ritual, the virtuous confuse
>> >> > regulation with outcomes