Memory usage high on server sides

tejas at gluster.com (Tejas N. Bhise) · Wed, 14 Apr 2010 21:03:10 -0600 (CST)

Hi Chris,

I would like your help in debugging this further. To start with, I would
like to get the system information and the test information.

You mentioned you are copying data from your old system to the new system.
The new system has 3 servers. 

Problems you saw - 

1) High memory usage on client where gluster volume is mounted
2) High memory usage on server
3) 2 days to copy 300 GB data

Is that a correct summary of the problems you saw?

About the config, can you provide the following for both old and new systems -

1) OS and kernel level on gluster servers and clients
2) volume file from servers and clients
3) Filesystem type of backend gluster subvolumes
4) How close to full the backend subvolumes are
5) The exact copy command. Did you mount the volumes from the old and
new systems on a single machine and use cp, rsync, or some other
method? If it was anything more than a plain cp, please send the
exact command line you used.
6) Roughly how many files and directories are in that 300 GB of data
(this would help us reproduce the problem in-house on a smaller test bed).
7) Was there other load on the new or old system ?
8) Any other patterns you noticed.
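
Items 1, 4 and 6 above can be collected with a short script. The following is only a minimal sketch, assuming Python 3 is available on the servers and clients; the function names are made up for illustration and are not part of any gluster tooling. The volume files, filesystem type and copy command still have to be reported by hand.

```python
import os
import platform
import shutil


def describe_system():
    """Item 1: OS name and kernel release of the local machine."""
    u = platform.uname()
    return {"os": u.system, "kernel": u.release}


def percent_full(path):
    """Item 4: how full the filesystem holding `path` is, in percent.

    Computed as used/total, so it can differ slightly from the Use%
    column of df on filesystems that reserve blocks for root.
    """
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def count_entries(root):
    """Item 6: number of files and directories below `root`."""
    files = dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs


# Example use (the path is a placeholder for a backend subvolume):
#   print(describe_system())
#   print(percent_full("/path/to/backend"))
#   print(count_entries("/path/to/backend"))
```

Counting entries walks the whole tree, so on a 300 GB data set it is better run on the backend directory directly than through the gluster mount.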

Thanks a lot for helping to debug the problem.

Regards,
Tejas.

----- Original Message -----
From: "Chris Jin" <chris at pikicentral.com>
To: "Krzysztof Strasburger" <strasbur at chkw386.ch.pwr.wroc.pl>
Cc: "gluster-users" <gluster-users at gluster.org>
Sent: Thursday, April 15, 2010 7:52:35 AM
Subject: Re: Memory usage high on server sides

Hi Krzysztof,
Thanks for your replies. You are right, the server process should be
glusterfsd. But I did mean the servers. After two days of copying, the two
processes took almost 70% of the total memory. I worry that one more
process would bring our servers down.

$ ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     26472  2.2 29.1 718100 600260 ?       Ssl  Apr09 184:09 glusterfsd -f /etc/glusterfs/servers/r2/f1.vol
root     26485  1.8 39.8 887744 821384 ?       Ssl  Apr09 157:16 glusterfsd -f /etc/glusterfs/servers/r2/f2.vol
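
The "almost 70%" figure is the sum of the two %MEM values above (29.1 + 39.8 = 68.9). To track this over time, the same sum can be taken from saved ps output. This is a rough sketch, assuming the `ps aux` column layout shown above (RSS in KiB, START printed without spaces); the `summarize` helper is a made-up name.

```python
import os

# The two server rows from the ps output above, one process per line.
PS_OUTPUT = """\
root     26472  2.2 29.1 718100 600260 ?       Ssl  Apr09 184:09 glusterfsd -f /etc/glusterfs/servers/r2/f1.vol
root     26485  1.8 39.8 887744 821384 ?       Ssl  Apr09 157:16 glusterfsd -f /etc/glusterfs/servers/r2/f2.vol
"""


def summarize(ps_text, name="glusterfsd"):
    """Return (total %MEM, total RSS in KiB) over processes called `name`."""
    total_pct = 0.0
    total_rss = 0
    for line in ps_text.splitlines():
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        # Match on the basename of any command token, so that the tree
        # prefix printed by "ps auxf" and absolute paths are tolerated,
        # and "glusterfs" does not match "glusterfsd".
        tokens = fields[10].split()
        if not any(os.path.basename(tok) == name for tok in tokens):
            continue
        total_pct += float(fields[3])
        total_rss += int(fields[5])
    return round(total_pct, 1), total_rss


print(summarize(PS_OUTPUT))
```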

In the meantime, the client side seems OK.

$ ps auxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     19692  1.3  0.0 262148  6980 ?        Ssl  Apr12 61:33 /sbin/glusterfs --log-level=NORMAL --volfile=/u2/git/modules/shared/glusterfs/clients/r2/c2.vol /gfs/r2/f2

Any ideas?

On Wed, 2010-04-14 at 10:16 +0200, Krzysztof Strasburger wrote:
> On Wed, Apr 14, 2010 at 06:33:15AM +0200, Krzysztof Strasburger wrote:
> > On Wed, Apr 14, 2010 at 09:22:09AM +1000, Chris Jin wrote:
> > > Hi, I ran one more test today. The copying has already run for 24 hours
> > > and the memory usage is about 800MB, 39.4% of the total. But there is no
> > > external IP connection error. Is this a memory leak?
> > Seems to be, and a very persistent one. Present in glusterfs at least 
> > since version 1.3 (the oldest I used).
> > Krzysztof
> I corrected the subject, as the memory usage is high on the client side
> (glusterfs is the client process, glusterfsd is the server, and it never
> used that much memory at my site).
> I did some more tests with logging. According to my old valgrind report,
> huge amounts of memory were still in use at exit, and these were allocated
> in __inode_create and __dentry_create. So I added log points in these functions
> and performed the "du test", i.e. mounted the glusterfs directory containing
> a large number of files with the log level set to TRACE, ran du on it,
> then ran echo 3 > /proc/sys/vm/drop_caches, waited a while until the log file
> stopped growing, and finally unmounted and checked the (huge) log file:
> prkom13:~# grep inode_create /var/log/glusterfs/root-loop-test.log |wc -l
> 151317
> prkom13:~# grep inode_destroy /var/log/glusterfs/root-loop-test.log |wc -l
> 151316
> prkom13:~# grep dentry_create /var/log/glusterfs/root-loop-test.log |wc -l
> 158688
> prkom13:~# grep dentry_unset /var/log/glusterfs/root-loop-test.log |wc -l
> 158688
> 
> Do you see? Everything seems to be OK: a number of inodes created, 1 fewer
> destroyed (probably the root inode), and the same number of dentries created
> and destroyed. The memory should be freed (there are calls to free in the
> inode_destroy and dentry_unset functions), but it is not. Any ideas what is
> going on?
> Glusterfs developers - is something kept in the lists where inodes
> and dentries live, interleaved with these inodes and entries, so that
> no memory page can be unmapped?
> We should also look at the kernel - why does it not send forgets immediately,
> even with drop_caches=3?
> Krzysztof
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> 
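
The four grep | wc -l runs quoted above can also be done in a single pass over the log, which helps when the TRACE log is huge. This is only a sketch under assumptions: the marker strings are the ones grepped for above, and the log lines come from custom log points whose exact format is not shown in this thread, so matching is by substring only, as grep does. The helper names are made up.

```python
from collections import Counter

# Substrings searched for in the quoted grep commands.
MARKERS = ("inode_create", "inode_destroy", "dentry_create", "dentry_unset")


def count_markers(lines, markers=MARKERS):
    """Count, per marker, the lines containing it (like grep MARKER | wc -l)."""
    counts = Counter({m: 0 for m in markers})
    for line in lines:
        for m in markers:
            if m in line:
                counts[m] += 1
    return counts


def imbalance(counts):
    """Creates minus destroys; a large positive value would point to a leak."""
    return {
        "inode": counts["inode_create"] - counts["inode_destroy"],
        "dentry": counts["dentry_create"] - counts["dentry_unset"],
    }


def count_markers_in_file(path):
    """Stream a log file line by line so it never has to fit in memory."""
    with open(path, errors="replace") as fh:
        return count_markers(fh)


# Example use, with the log path from the quoted commands:
#   counts = count_markers_in_file("/var/log/glusterfs/root-loop-test.log")
#   print(counts, imbalance(counts))
```

With the counts posted above, the inode difference is 151317 - 151316 = 1 and the dentry difference is 0, so the create/destroy bookkeeping balances and the retained memory has to be explained some other way.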
