What operations are you doing on the mount point? How easy is it to reproduce the memory leak?

On Thu, Dec 24, 2009 at 5:08 AM, Larry Bates <larry.bates at vitalesafe.com> wrote:

> I restarted the mirror and once again the glusterfs client process is
> just growing. Echoing 3 to /proc/sys/vm/drop_caches seems to shrink the
> memory footprint, but only by a small amount.
>
> BTW - it is the client process that seems to grow infinitely. Note: I have
> one machine that is acting as both a client and a server (which is where
> the client process is growing) and another machine that is a dedicated
> GFS server.
>
> Here are my configs:
>
> Client:
>
> #
> # Add client feature and attach to remote subvolumes of gfs001
> #
> volume gfs001brick1
>   type protocol/client
>   option transport-type tcp              # for TCP/IP transport
>   option remote-host gfs001              # IP address of the remote volume
>   option remote-subvolume vol1           # name of the remote volume
>   #option transport.socket.nodelay on    # undocumented option for speed
> end-volume
>
> volume gfs001brick2
>   type protocol/client
>   option transport-type tcp              # for TCP/IP transport
>   option remote-host gfs001              # IP address of the remote volume
>   option remote-subvolume vol2           # name of the remote volume
>   #option transport.socket.nodelay on    # undocumented option for speed
> end-volume
>
> volume coraidbrick13
>   type protocol/client
>   option transport-type tcp
>   option remote-host 10.0.0.71
>   option remote-subvolume coraidvol13
>   #option transport.socket.nodelay on
> end-volume
>
> volume coraidbrick14
>   type protocol/client
>   option transport-type tcp
>   option remote-host 10.0.0.71
>   option remote-subvolume coraidvol14
>   #option transport.socket.nodelay on
> end-volume
>
> #
> # Replicate volumes
> #
> volume afr-vol1
>   type cluster/replicate
>   subvolumes gfs001brick1 coraidbrick13
> end-volume
>
> volume afr-vol2
>   type cluster/replicate
>   subvolumes gfs001brick2 coraidbrick14
> end-volume
>
> #
> # Distribute files across bricks
> #
> volume dht-vol
>   type cluster/distribute
>   subvolumes afr-vol1 afr-vol2
>   option min-free-disk 2%                # 2% of 1.8TB volumes is 36GB
> end-volume
>
> #
> # Add quick-read for small files
> #
> volume quickread
>   type performance/quick-read
>   option cache-timeout 1                 # default 1 second
>   option max-file-size 256KB             # default 64KB
>   subvolumes dht-vol
> end-volume
>
> Server:
>
> volume coraidbrick13
>   type storage/posix                     # POSIX FS translator
>   option directory /mnt/glusterfs/e101.13   # Export this directory
>   option background-unlink yes           # unlink in background
> end-volume
>
> volume coraidbrick14
>   type storage/posix
>   option directory /mnt/glusterfs/e101.14
>   option background-unlink yes
> end-volume
>
> volume iot1
>   type performance/io-threads
>   option thread-count 4
>   subvolumes coraidbrick13
> end-volume
>
> volume iot2
>   type performance/io-threads
>   option thread-count 4
>   subvolumes coraidbrick14
> end-volume
>
> volume coraidvol13
>   type features/locks
>   subvolumes iot1
> end-volume
>
> volume coraidvol14
>   type features/locks
>   subvolumes iot2
> end-volume
>
> ## Add network serving capability to volumes
> volume server
>   type protocol/server
>   option transport-type tcp              # For TCP/IP transport
>   subvolumes coraidvol13 coraidvol14     # Expose both volumes
>   option auth.addr.coraidvol13.allow 10.0.0.*
>   option auth.addr.coraidvol14.allow 10.0.0.*
> end-volume
>
> Thanks,
> Larry
>
> *From:* raghavendra.hg at gmail.com [mailto:raghavendra.hg at gmail.com] *On Behalf Of* Raghavendra G
> *Sent:* Wednesday, December 23, 2009 2:43 PM
> *To:* Larry Bates
> *Cc:* gluster-users
> *Subject:* Re: glusterFS memory leak?
>
> Hi Larry,
>
> What is the client and server configuration? Instead of killing the
> glusterfs process, can you do "echo 3 > /proc/sys/vm/drop_caches" and
> check whether memory usage comes down?
>
> regards,
>
> On Wed, Dec 23, 2009 at 9:01 PM, Larry Bates <larry.bates at vitalesafe.com> wrote:
>
> I've successfully set up GlusterFS 3.0 with a single server. I brought on
> a 2nd server, set up AFR, and have been working through the mirroring
> process. I started an "ls -alR" to trigger a complete mirror of all files
> between the two servers. After running for about 16 hours I started
> getting kernel out-of-memory errors. Looking at top I see:
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2561 root  15   0 13.0g 8.9g  860 S    7 91.6 60:56.02  glusterfs
>
> Seems that the client has used up all my memory (RAM and swap). Killing
> the process returned all the memory to the OS.
>
> BTW - I'm working on a 2.3TB store that contains about 2.5 million files
> in 65K folders.
>
> Thoughts?
>
> Larry Bates
> vitalEsafe, Inc.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

-- 
Raghavendra G
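
To make the comparison Raghavendra asks for easier, below is a minimal shell sketch for logging the client's resident memory while the "ls -alR" run reproduces the problem. The pgrep-based process selection and the 60-second interval are assumptions, not something from the thread; on Larry's box the PID reported by top (2561) could be used directly.

    #!/bin/sh
    # Sketch: log the glusterfs client's RSS once a minute so its growth,
    # and the effect of "echo 3 > /proc/sys/vm/drop_caches", can be tracked
    # over the life of the self-heal run.
    # Assumption: a single glusterfs process; otherwise set PID by hand.
    PID=$(pgrep -o -x glusterfs) || exit 1
    while kill -0 "$PID" 2>/dev/null; do
        rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$PID/status")
        printf '%s  pid=%s  rss=%s kB\n' "$(date '+%F %T')" "$PID" "$rss_kb"
        sleep 60
    done

Since /proc/sys/vm/drop_caches only releases kernel page, dentry, and inode caches, an RSS that keeps climbing after the echo points to memory held inside the glusterfs process itself, which is consistent with the small shrinkage Larry reports.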