Good afternoon gluster-users. The glusterd memory footprint on one of the two servers (always the same one, the first in the cluster, server01) keeps growing. Cluster performance gets slower as the memory usage grows, the load average on the server also climbs, and by the time the server process is using 100 MB of resident memory (as shown by top) it is using almost 100% of the CPU. Both server processes start out using about 9 MB of memory. Both backend servers are identical, with identical configuration files (apart from the volume descriptor names). Both run an identical build of CentOS 5.3 with kernel 2.6.18-128.1.6.el5. Each server has three 30 TB XFS partitions (90 TB total per server), in a simple RAID 1-style setup between the two servers. Same hardware, same software; everything is identical. Restarting the glusterd process every 12 hours seems silly; does anybody have a better solution?

thanks,
liam

server1 glusterd.vol (this is the server with the memory leak):

volume intstore01a
  type storage/posix
  option directory /intstore/intstore01a/gdata
end-volume

volume intstore01b
  type storage/posix
  option directory /intstore/intstore01b/gdata
end-volume

volume intstore01c
  type storage/posix
  option directory /intstore/intstore01c/gdata
end-volume

volume locksa
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01a
end-volume

volume locksb
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01b
end-volume

volume locksc
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore01c
end-volume

volume brick1a
  type performance/io-threads
  option thread-count 8
  subvolumes locksa
end-volume

volume brick1b
  type performance/io-threads
  option thread-count 8
  subvolumes locksb
end-volume

volume brick1c
  type performance/io-threads
  option thread-count 8
  subvolumes locksc
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1a.allow x.x.x.*
  option auth.addr.brick1b.allow x.x.x.*
  option auth.addr.brick1c.allow x.x.x.*
  subvolumes brick1a brick1b brick1c
end-volume

server2 glusterd.vol (no memory leak):

volume intstore02a
  type storage/posix
  option directory /intstore/intstore02a/gdata
end-volume

volume intstore02b
  type storage/posix
  option directory /intstore/intstore02b/gdata
end-volume

volume intstore02c
  type storage/posix
  option directory /intstore/intstore02c/gdata
end-volume

volume locksa
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore02a
end-volume

volume locksb
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore02b
end-volume

volume locksc
  type features/posix-locks
  option mandatory-locks on
  subvolumes intstore02c
end-volume

volume brick2a
  type performance/io-threads
  option thread-count 8
  subvolumes locksa
end-volume

volume brick2b
  type performance/io-threads
  option thread-count 8
  subvolumes locksb
end-volume

volume brick2c
  type performance/io-threads
  option thread-count 8
  subvolumes locksc
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick2a.allow x.x.x.*
  option auth.addr.brick2b.allow x.x.x.*
  option auth.addr.brick2c.allow x.x.x.*
  subvolumes brick2a brick2b brick2c
end-volume
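For reference, one rough way to quantify the growth is to sample the server process's resident memory at a fixed interval on both machines and compare the curves. The sketch below does that with pgrep and /proc; it assumes the stock Python 2 on CentOS 5, that pgrep is installed, and that the daemon shows up under the process name set in PROC_NAME, so adjust those to match your install.

#!/usr/bin/env python
# Sketch: sample the resident memory (VmRSS) of the gluster server
# process every few minutes so the growth on server01 can be compared
# with server02 over time.
import commands
import time

PROC_NAME = "glusterfsd"   # assumption: use whatever name shows up in ps
INTERVAL = 300             # seconds between samples (assumed; adjust to taste)

def rss_kb(pid):
    # /proc/<pid>/status reports resident memory as "VmRSS: <n> kB"
    for line in open("/proc/%s/status" % pid):
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0

while True:
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    # pgrep prints one pid per line; empty output means no matching process
    for pid in commands.getoutput("pgrep " + PROC_NAME).split():
        print "%s pid=%s rss=%d kB" % (stamp, pid, rss_kb(pid))
    time.sleep(INTERVAL)

Left running side by side on server01 and server02, the log should show a roughly flat RSS on server02 and steady growth on server01, which would also give a growth rate to include in a bug report if this turns out to be a leak.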
We have only two clients, both running CentOS 5.3, and both have the exact same gluster.vol configuration file.

client volume file:

volume brick1a
  type protocol/client
  option transport-type tcp
  option remote-host server01
  option remote-subvolume brick1a
end-volume

volume brick1b
  type protocol/client
  option transport-type tcp
  option remote-host server01
  option remote-subvolume brick1b
end-volume

volume brick1c
  type protocol/client
  option transport-type tcp
  option remote-host server01
  option remote-subvolume brick1c
end-volume

volume brick2a
  type protocol/client
  option transport-type tcp
  option remote-host server02
  option remote-subvolume brick2a
end-volume

volume brick2b
  type protocol/client
  option transport-type tcp
  option remote-host server02
  option remote-subvolume brick2b
end-volume

volume brick2c
  type protocol/client
  option transport-type tcp
  option remote-host server02
  option remote-subvolume brick2c
end-volume

volume bricks1
  type cluster/replicate
  subvolumes brick1a brick2a
end-volume

volume bricks2
  type cluster/replicate
  subvolumes brick1b brick2b
end-volume

volume bricks3
  type cluster/replicate
  subvolumes brick1c brick2c
end-volume

volume distribute
  type cluster/distribute
  subvolumes bricks1 bricks2 bricks3
end-volume

volume writebehind
  type performance/write-behind
  option block-size 1MB
  option cache-size 64MB
  option flush-behind on
  subvolumes distribute
end-volume

volume cache
  type performance/io-cache
  option cache-size 2048MB
  subvolumes writebehind
end-volume
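Since the client volfile is where the replicate pairs are wired up, one quick sanity check is to dump each volume's type and subvolumes and confirm that bricks1, bricks2 and bricks3 each span one brick from server01 and one from server02. The sketch below is a minimal .vol parser for that; it is written for the same stock Python 2, and the default volfile path is an assumption, so pass the real path as the first argument.

#!/usr/bin/env python
# Sketch: print each volume in a glusterfs .vol file with its type and
# subvolumes, to eyeball the translator graph and the replicate pairs.
import sys

def parse_volfile(path):
    volumes = []
    current = None
    for raw in open(path):
        line = raw.split("#")[0].strip()   # drop comments and blank lines
        if not line:
            continue
        words = line.split()
        if words[0] == "volume":
            current = {"name": words[1], "type": "", "subvolumes": []}
        elif words[0] == "type" and current:
            current["type"] = words[1]
        elif words[0] == "subvolumes" and current:
            current["subvolumes"] = words[1:]
        elif words[0] == "end-volume" and current:
            volumes.append(current)
            current = None
    return volumes

if __name__ == "__main__":
    if len(sys.argv) > 1:
        path = sys.argv[1]
    else:
        path = "/etc/glusterfs/glusterfs.vol"   # assumed default location
    for vol in parse_volfile(path):
        print "%-12s %-24s -> %s" % (vol["name"], vol["type"],
                                     ", ".join(vol["subvolumes"]) or "-")

Run against the client file above it should print lines such as "bricks1  cluster/replicate -> brick1a, brick2a", and the same pattern for bricks2 and bricks3, with distribute sitting on top of the three replicate volumes.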