Re: Memory leak with a replica 3 arbiter 1 configuration

Hi Ravi,

I saw that you updated the patch today (http://review.gluster.org/#/c/15289/). I built an RPM of the first iteration of the patch (the one-line change in arbiter.c, "GF_FREE (ctx->iattbuf);") and am now running it on some test servers to see whether the arbiter brick's memory still grows out of control.
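
Something like the following loop should be enough to track the brick's resident memory over time; this is only a minimal sketch, and the pgrep pattern and log path below are placeholders rather than my actual setup:

# Minimal sketch: log the arbiter brick's resident memory every 5 minutes.
# The pgrep pattern and log path are placeholders, not the real configuration.
while true ; do
  pid=$(pgrep -f 'glusterfsd.*arbiter' | head -n 1)
  echo "$(date '+%F %T') rss_kb=$(ps -o rss= -p "$pid")" >> /var/tmp/arbiter-rss.log
  sleep 300
done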

Ben

On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
Hi Benjamin,

On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
I've attached a statedump of the problem brick process.  Let me know if there are any other logs you need.

Thanks for the report! I've sent a fix @ http://review.gluster.org/#/c/15289/ . It would be nice if you could verify that the patch fixes the issue for you.

Thanks,
Ravi


Thanks a lot,
Ben

On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
Could you collect a statedump of the brick process by following https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump ?

That should help us identify which data type is causing the leak so we can fix it.
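
Roughly, a statedump can be generated like this (the volume name below is just an example; the dump files land under /var/run/gluster by default, or wherever server.statedump-path points):

# Trigger a statedump for all bricks of the volume ("testvol" is an example name).
gluster volume statedump testvol

# Alternatively, send SIGUSR1 directly to the suspect brick process.
kill -USR1 <arbiter-brick-pid>

# The dump files are written under /var/run/gluster by default.
ls /var/run/gluster/*.dump.*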

Thanks!

On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar <benedgar8@xxxxxxxxx> wrote:
Hi,

I appear to have a memory leak with a replica 3 arbiter 1 configuration of gluster. One server has a data brick and the arbiter brick, and a second server has the other data brick. The more files I write to gluster in this configuration, the more memory the arbiter brick process takes up.
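
For reference, such a volume is typically created with the replica 3 arbiter 1 syntax, along these lines (hostnames and brick paths below are placeholders, not my actual ones):

# server1 carries a data brick and the arbiter brick, server2 the second data brick.
# Hostnames and brick paths are placeholders; gluster may warn and require "force"
# (or a confirmation) because two bricks of the replica set share a server.
gluster volume create testvol replica 3 arbiter 1 \
    server1:/bricks/data1 \
    server2:/bricks/data2 \
    server1:/bricks/arbiter force
gluster volume start testvol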

I am able to reproduce this issue by first setting up a replica 3 arbiter 1 configuration and then running the following bash script, which creates 10,000 200 kB files, deletes them, and repeats forever:

# Create 10,000 200 kB files, delete them all, and repeat indefinitely.
while true ; do
  for i in {1..10000} ; do
    dd if=/dev/urandom bs=200K count=1 of="$TEST_FILES_DIR/file$i"
  done
  rm -rf "$TEST_FILES_DIR"/*
done

$TEST_FILES_DIR is a location on my gluster mount.

After about 3 days of this script running on one of my clusters, this is what the output of "top" looks like:
  PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
16039 root 20  0 1397220  77720 3948 S 20.6  1.0 860:01.53 glusterfsd
13174 root 20  0 1395824 112728 3692 S 19.6  1.5 806:07.17 glusterfs
19961 root 20  0 2967204 2.145g 3896 S 17.3 29.0 752:10.70 glusterfsd

As you can see, one of the brick processes (the arbiter brick) is using over 2 gigabytes of resident memory.

One workaround for this is to kill the arbiter brick process and restart the gluster daemon. This restarts the arbiter brick process, and its memory usage drops back down to a reasonable level. However, I would rather not have to kill the arbiter brick every week in production environments.
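
Concretely, the workaround looks something like this (the volume name is just an example):

# Kill the leaking arbiter brick process (PID taken from 'gluster volume status').
kill <arbiter-brick-pid>

# Restarting glusterd (or force-starting the volume) brings the brick back up
# with its memory usage back to normal.
systemctl restart glusterd
# or: gluster volume start testvol force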

Has anyone seen this issue before, and is there a known workaround or fix?

Thanks,
Ben




--
Pranith



--
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878







--
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
