Re: Run away memory with gluster mount

Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx> · Mon, 29 Jan 2018 02:30:36 -0500 (EST)

----- Original Message -----
> From: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> To: "Ravishankar N" <ravishankar@xxxxxxxxxx>
> Cc: "Csaba Henk" <chenk@xxxxxxxxxx>, "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Monday, January 29, 2018 10:49:43 AM
> Subject: Re:  Run away memory with gluster mount
> 
> Csaba,
> 
> Could this be the problem of the inodes not getting freed in the fuse
> process?

We can answer that question only after looking into the statedumps. If we find too many inodes in the fuse inode table's lru list (with refcount 0, lookup count > 0), it could be because of sub-optimal garbage collection of inodes.

> 
> Daniel,
> as Ravi requested, please provide access to the statedumps. You can strip out
> the filepath information.
> Does your data set include a lot of directories?
> 
> 
> Thanks,
> Nithya
> 
> On 27 January 2018 at 10:23, Ravishankar N < ravishankar@xxxxxxxxxx > wrote:
> 
> 
> 
> 
> 
> On 01/27/2018 02:29 AM, Dan Ragle wrote:
> 
> 
> 
> On 1/25/2018 8:21 PM, Ravishankar N wrote:
> 
> 
> 
> 
> On 01/25/2018 11:04 PM, Dan Ragle wrote:
> 
> 
> *sigh* trying again to correct formatting ... apologize for the earlier mess.
> 
> Having a memory issue with Gluster 3.12.4 and not sure how to troubleshoot. I
> don't *think* this is expected behavior.
> 
> This is on an updated CentOS 7 box. The setup is a simple two node replicated
> layout where the two nodes act as both server and
> client.
> 
> The volume in question:
> 
> Volume Name: GlusterWWW
> Type: Replicate
> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
> Options Reconfigured:
> nfs.disable: on
> cluster.favorite-child-policy: mtime
> transport.address-family: inet
> 
> I had some other performance options in there, (increased cache-size, md
> invalidation, etc) but stripped them out in an attempt to
> isolate the issue. Still got the problem without them.
> 
> The volume currently contains over 1M files.
> 
> When mounting the volume, I get (among other things) a process as such:
> 
> /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW
> /var/www
> 
> This process begins with little memory, but then as files are accessed in the
> volume the memory increases. I set up a script that
> simply reads the files in the volume one at a time (no writes). It's been
> running on and off about 12 hours now and the resident
> memory of the above process is already at 7.5G and continues to grow slowly.
> If I stop the test script the memory stops growing,
> but does not reduce. Restart the test script and the memory begins slowly
> growing again.
> 
> This is obviously a contrived app environment. With my intended application
> load it takes about a week or so for the memory to get
> high enough to invoke the oom killer.
> 
> Can you try debugging with the statedump of the fuse mount process
> ( https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump )
> and see what member is leaking? Take the statedumps in succession, maybe once
> initially during the I/O and once the memory gets high enough to hit the OOM mark.
> Share the dumps here.
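
Comparing successive dumps by hand is tedious, so here is a minimal sketch of how it could be scripted. It assumes each memory-accounting section has a one-line header ending in `memusage` (the form quoted later in this thread) followed by a `num_allocs=<integer>` line; the function names are made up for illustration.

```python
import re

USAGE_RE = re.compile(r"^\[(.+ memusage)\]\s*$")


def num_allocs_by_type(text):
    """Map each '... memusage' section header to its num_allocs value."""
    result = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            match = USAGE_RE.match(stripped)
            current = match.group(1) if match else None
        elif current and stripped.startswith("num_allocs="):
            result[current] = int(stripped.split("=", 1)[1])
    return result


def steadily_growing(dumps):
    """Return usage types whose num_allocs strictly increases in every dump.

    `dumps` is a list of statedump texts, ordered oldest to newest.
    """
    if len(dumps) < 2:
        return {}
    samples = [num_allocs_by_type(d) for d in dumps]
    common = set(samples[0]).intersection(*samples[1:])
    growing = {}
    for name in sorted(common):
        series = [s[name] for s in samples]
        if all(a < b for a, b in zip(series, series[1:])):
            growing[name] = series
    return growing
```

Strict growth across every sample is a deliberately narrow filter: it hides types that merely fluctuate, at the cost of missing a leak that happened to pause between two samples.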
> 
> Regards,
> Ravi
> 
> Thanks for the reply. I noticed yesterday that an update (3.12.5) had been
> posted so I went ahead and updated and repeated the test overnight. The
> memory usage does not appear to be growing as quickly as it was with 3.12.4,
> but does still appear to be growing.
> 
> I should also mention that there is another process beyond my test app that
> is reading the files from the volume. Specifically, there is an rsync that
> runs from the second node 2-4 times an hour that reads from the GlusterWWW
> volume mounted on node 1. Since none of the files in that mount are changing
> it doesn't actually rsync anything, but nonetheless it is running and
> reading the files in addition to my test script. (It's a part of my intended
> production setup that I forgot was still running.)
> 
> The mount process appears to be gaining memory at a rate of about 1GB every 4
> hours or so. At that rate it'll take several days before it runs the box out
> of memory. But I took your suggestion and made some statedumps today anyway,
> about 2 hours apart, 4 total so far. It looks like there may already be some
> actionable information. These are the only registers where the num_allocs
> have grown with each of the four samples:
> 
> [mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
> ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
> ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
> ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
> ---> num_allocs at Fri Jan 26 14:58:27 2018: 908
> 
> [mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
> ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
> ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
> ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
> ---> num_allocs at Fri Jan 26 14:58:27 2018: 17
> 
> [cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t
> memusage]
> ---> num_allocs at Fri Jan 26 08:57:31 2018: 24243596
> ---> num_allocs at Fri Jan 26 10:55:50 2018: 27902622
> ---> num_allocs at Fri Jan 26 12:55:15 2018: 30678066
> ---> num_allocs at Fri Jan 26 14:58:27 2018: 33801036
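
To put those counters on a common scale, the sketch below turns the first and last samples of a series into an average growth rate per hour. The timestamps and counts are copied from the dht layout figures above; the function name is made up, and `strptime` with `%a`/`%b` assumes an English (C) locale.

```python
from datetime import datetime

# (timestamp, num_allocs) for gf_dht_mt_dht_layout_t, as posted above.
DHT_LAYOUT_SAMPLES = [
    ("Fri Jan 26 08:57:31 2018", 24243596),
    ("Fri Jan 26 10:55:50 2018", 27902622),
    ("Fri Jan 26 12:55:15 2018", 30678066),
    ("Fri Jan 26 14:58:27 2018", 33801036),
]


def allocs_per_hour(samples):
    """Average num_allocs growth per hour between first and last sample."""
    fmt = "%a %b %d %H:%M:%S %Y"
    start = datetime.strptime(samples[0][0], fmt)
    end = datetime.strptime(samples[-1][0], fmt)
    hours = (end - start).total_seconds() / 3600.0
    return (samples[-1][1] - samples[0][1]) / hours


if __name__ == "__main__":
    print(round(allocs_per_hour(DHT_LAYOUT_SAMPLES)))
```

Over the roughly six hours covered, the dht layout count rises by about 1.6 million allocations per hour, against about 20 per hour for gf_fuse_mt_gids_t and about 2 per hour for gf_common_mt_fd_lk_ctx_t, which is why the dht layout type stands out.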
> 
> Not sure the best way to get you the full dumps. They're pretty big, over 1G
> for all four. Also, I noticed some filepath information in there that I'd
> rather not share. What's the recommended next step?
> 
> I've CC'd the fuse/ dht devs to see if these data types have potential leaks.
> Could you raise a bug with the volume info and a (dropbox?) link from which
> we can download the dumps? You can remove/replace the filepaths from them.
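
One possible way to strip the file paths before uploading is sketched below. It assumes the sensitive values sit on `key=value` lines whose key is exactly `path`; that key name is a guess, so check the dumps for other places a path can appear (and extend `keys`) before sharing. Each value is replaced by a short hash rather than deleted, so repeated references to the same file remain recognisable as the same file.

```python
import hashlib


def redact_paths(text, keys=("path",)):
    """Replace the value of every `key=value` line whose key is in `keys`.

    The replacement is a truncated SHA-256 of the original value, so equal
    paths map to equal tokens without revealing the path itself.
    """
    out = []
    for line in text.splitlines(keepends=True):
        key, sep, value = line.partition("=")
        if sep and key.strip() in keys:
            body = value.rstrip("\r\n")
            ending = value[len(body):]  # preserve the original line ending
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:12]
            line = f"{key}=<redacted-{digest}>{ending}"
        out.append(line)
    return "".join(out)
```

Since the dumps are over 1G in total, applying this line by line while streaming the file would be kinder on memory than reading a whole dump into one string; the per-line logic stays the same.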
> 
> Regards,
> Ravi
> 
> 
> 
> 
> 
> Cheers!
> 
> Dan
> 
> 
> 
> 
> 
> 
> Is there potentially something misconfigured here?
> 
> I did see a reference to a memory leak in another thread in this list, but
> that had to do with the setting of quotas, I don't have
> any quotas set on my system.
> 
> Thanks,
> 
> Dan Ragle
> daniel@xxxxxxxxxxxxxx
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-users