----- Original Message -----
> From: "Ravishankar N" <ravishankar@xxxxxxxxxx>
> To: "Dan Ragle" <daniel@xxxxxxxxxxxxxx>, gluster-users@xxxxxxxxxxx
> Cc: "Csaba Henk" <chenk@xxxxxxxxxx>, "Niels de Vos" <ndevos@xxxxxxxxxx>, "Nithya Balachandran" <nbalacha@xxxxxxxxxx>, "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> Sent: Saturday, January 27, 2018 10:23:38 AM
> Subject: Re: Run away memory with gluster mount
>
> On 01/27/2018 02:29 AM, Dan Ragle wrote:
> >
> > On 1/25/2018 8:21 PM, Ravishankar N wrote:
> >>
> >> On 01/25/2018 11:04 PM, Dan Ragle wrote:
> >>> *sigh* trying again to correct formatting ... apologize for the earlier mess.
> >>>
> >>> Having a memory issue with Gluster 3.12.4 and not sure how to troubleshoot. I don't *think* this is expected behavior.
> >>>
> >>> This is on an updated CentOS 7 box. The setup is a simple two-node replicated layout where the two nodes act as both server and client.
> >>>
> >>> The volume in question:
> >>>
> >>> Volume Name: GlusterWWW
> >>> Type: Replicate
> >>> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
> >>> Status: Started
> >>> Snapshot Count: 0
> >>> Number of Bricks: 1 x 2 = 2
> >>> Transport-type: tcp
> >>> Bricks:
> >>> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
> >>> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
> >>> Options Reconfigured:
> >>> nfs.disable: on
> >>> cluster.favorite-child-policy: mtime
> >>> transport.address-family: inet
> >>>
> >>> I had some other performance options in there (increased cache-size, md invalidation, etc.) but stripped them out in an attempt to isolate the issue. Still got the problem without them.
> >>>
> >>> The volume currently contains over 1M files.
> >>>
> >>> When mounting the volume, I get (among other things) a process as such:
> >>>
> >>> /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW /var/www
> >>>
> >>> This process begins with little memory, but then as files are accessed in the volume the memory increases. I set up a script that simply reads the files in the volume one at a time (no writes). It's been running on and off for about 12 hours now and the resident memory of the above process is already at 7.5G and continues to grow slowly. If I stop the test script the memory stops growing, but does not reduce. Restart the test script and the memory begins slowly growing again.
> >>>
> >>> This is obviously a contrived app environment. With my intended application load it takes about a week or so for the memory to get high enough to invoke the OOM killer.
> >>
> >> Can you try debugging with the statedump (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump) of the fuse mount process and see which member is leaking? Take the statedumps in succession, maybe once initially during the I/O and once when the memory gets high enough to hit the OOM mark. Share the dumps here.
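> >>
> >> For reference, a dump of the fuse client is triggered by sending SIGUSR1 to the glusterfs mount process; with the default state-dump location it lands under /var/run/gluster. Something along these lines should do it (the pid and timestamp below are only illustrative):
> >>
> >> # pgrep -f 'volfile-id=/GlusterWWW'
> >> 1234
> >> # kill -USR1 1234
> >> # ls /var/run/gluster/
> >> glusterdump.1234.dump.1516975411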
> >>
> >> Regards,
> >> Ravi
> >
> > Thanks for the reply. I noticed yesterday that an update (3.12.5) had been posted, so I went ahead and updated and repeated the test overnight. The memory usage does not appear to be growing as quickly as it was with 3.12.4, but it does still appear to be growing.
> >
> > I should also mention that there is another process beyond my test app that is reading the files from the volume. Specifically, there is an rsync that runs from the second node 2-4 times an hour and reads from the GlusterWWW volume mounted on node 1. Since none of the files in that mount are changing it doesn't actually rsync anything, but nonetheless it is running and reading the files in addition to my test script. (It's a part of my intended production setup that I forgot was still running.)
> >
> > The mount process appears to be gaining memory at a rate of about 1GB every 4 hours or so. At that rate it'll take several days before it runs the box out of memory. But I took your suggestion and made some statedumps today anyway, about 2 hours apart, 4 total so far. It looks like there may already be some actionable information. These are the only registers where the num_allocs have grown with each of the four samples:
> >
> > [mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
> > ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
> > ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
> > ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
> > ---> num_allocs at Fri Jan 26 14:58:27 2018: 908
> >
> > [mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
> > ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
> > ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
> > ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
> > ---> num_allocs at Fri Jan 26 14:58:27 2018: 17
> >
> > [cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t memusage]
> > ---> num_allocs at Fri Jan 26 08:57:31 2018: 24243596
> > ---> num_allocs at Fri Jan 26 10:55:50 2018: 27902622
> > ---> num_allocs at Fri Jan 26 12:55:15 2018: 30678066
> > ---> num_allocs at Fri Jan 26 14:58:27 2018: 33801036
> >
> > Not sure the best way to get you the full dumps. They're pretty big, over 1G for all four. Also, I noticed some filepath information in there that I'd rather not share. What's the recommended next step?

Please run the following queries on the statedump files and report the results:

# grep itable <client-statedump> | grep active | wc -l
# grep itable <client-statedump> | grep active_size
# grep itable <client-statedump> | grep lru | wc -l
# grep itable <client-statedump> | grep lru_size
# grep itable <client-statedump> | grep purge | wc -l
# grep itable <client-statedump> | grep purge_size
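If it's easier, a small loop along these lines will run all six queries against each dump in one go (adjust the glob to wherever your dumps were written):

for f in /var/run/gluster/glusterdump.*.dump.*; do
    echo "=== $f ==="
    grep itable "$f" | grep active | wc -l
    grep itable "$f" | grep active_size
    grep itable "$f" | grep lru | wc -l
    grep itable "$f" | grep lru_size
    grep itable "$f" | grep purge | wc -l
    grep itable "$f" | grep purge_size
done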
> I've CC'd the fuse/dht devs to see if these data types have potential leaks. Could you raise a bug with the volume info and a (dropbox?) link from which we can download the dumps? You can remove/replace the filepaths from them.
>
> Regards,
> Ravi
>
> > Cheers!
> >
> > Dan
> >
> >>> Is there potentially something misconfigured here?
> >>>
> >>> I did see a reference to a memory leak in another thread in this list, but that had to do with the setting of quotas; I don't have any quotas set on my system.
> >>>
> >>> Thanks,
> >>>
> >>> Dan Ragle
> >>> daniel@xxxxxxxxxxxxxx

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users