Re: Gluster eating up a lot of ram

Will this kill the actual process or simply trigger the dump? Which process should I kill: the brick process on the system or the fuse mount?

Diego

On Mon, Jul 29, 2019, 23:27 Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:


On Tue, 30 Jul 2019 at 05:44, Diego Remolina <dijuremo@xxxxxxxxx> wrote:
Unfortunately statedump crashes on both machines, even freshly rebooted.

Do you see any statedump files in /var/run/gluster? This looks more like the gluster CLI crashed.
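If the CLI keeps segfaulting, one possible workaround (a sketch, assuming the default statedump directory shown by --print-statedumpdir): sending SIGUSR1 to a gluster process asks it to write a statedump without terminating it, which bypasses the crashing CLI entirely:

    # ask the brick process(es) to write a statedump; SIGUSR1 does not kill them
    kill -USR1 $(pgrep -x glusterfsd)
    # the dump files should then appear in the statedump directory
    ls -l /var/run/gluster/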

[root@ysmha01 ~]# gluster --print-statedumpdir
/var/run/gluster
[root@ysmha01 ~]# gluster v statedump export
Segmentation fault (core dumped)

[root@ysmha02 ~]# uptime
 20:12:20 up 6 min,  1 user,  load average: 0.72, 0.52, 0.24
[root@ysmha02 ~]# gluster --print-statedumpdir
/var/run/gluster
[root@ysmha02 ~]# gluster v statedump export
Segmentation fault (core dumped)

I rebooted today after 40 days of uptime. Gluster was using just shy of 40GB of RAM out of 64GB.

What would you recommend as the next step?

Diego

On Mon, Mar 4, 2019 at 5:07 AM Poornima Gurusiddaiah <pgurusid@xxxxxxxxxx> wrote:
Could you also provide the statedump of the gluster process consuming 44G of RAM [1]? Please make sure the statedump is taken when the memory consumption is very high (tens of GBs); otherwise we may not be able to identify the issue. Also, I see that the cache size is 10G. Is that something you arrived at after doing some tests? It is considerably higher than normal.
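A quick way to spot which gluster process is actually holding the memory before taking the dump (a sketch; the RSS column is in KiB, and the fuse client is the glusterfs process whose command line includes the mount point):

    # list gluster processes with their resident memory and full command line
    ps -eo pid,rss,args | grep '[g]luster'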


On Mon, Mar 4, 2019 at 12:23 AM Diego Remolina <dijuremo@xxxxxxxxx> wrote:
Hi,

I will not be able to test gluster-6rc because this is a production environment, and it takes several days for memory use to grow significantly.

The Samba server hosts all types of files, small and large, from small roaming-profile files to bigger files such as the Adobe suite and Autodesk Revit projects (file sizes in the hundreds of megabytes).

As I stated before, this same issue was present with 3.8.x, which I was running previously.

The information you requested:

[root@ysmha02 ~]# gluster v info export

Volume Name: export
Type: Replicate
Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.cache-min-file-size: 0
network.inode-lru-limit: 65536
performance.cache-invalidation: on
features.cache-invalidation: on
performance.md-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
transport.address-family: inet
server.allow-insecure: on
performance.cache-size: 10GB
cluster.server-quorum-type: server
nfs.disable: on
performance.io-thread-count: 64
performance.io-cache: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
server.event-threads: 5
client.event-threads: 5
performance.cache-max-file-size: 256MB
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
cluster.server-quorum-ratio: 51%
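Since the 10GB performance.cache-size stands out as unusually high, one thing that could be tried is dialing it back (a sketch; 1GB is only an illustrative value, not a recommendation from this thread):

    # reduce the io-cache size on the volume; pick a value that fits the workload
    gluster volume set export performance.cache-size 1GB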







On Fri, Mar 1, 2019 at 11:07 PM Poornima Gurusiddaiah <pgurusid@xxxxxxxxxx> wrote:
This high memory consumption is not normal; it looks like a memory leak. Is it possible to try it on a test setup with gluster-6rc? What kind of workload goes through the fuse mount, large files or small files? We need the following information to debug further:
- Gluster volume info output
- Statedump of the Gluster fuse mount process consuming 44G of RAM.

Regards,
Poornima


On Sat, Mar 2, 2019, 3:40 AM Diego Remolina <dijuremo@xxxxxxxxx> wrote:
I am using glusterfs with two servers as a file server, sharing files via Samba and CTDB. I cannot use the Samba vfs_glusterfs plugin due to a bug in the current CentOS version of Samba, so I am mounting via fuse and exporting the volume to Samba from the mount point.
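For context, the layout is essentially a plain fuse mount exported as an ordinary path-based Samba share, roughly like the sketch below (the /mnt/export mount point and the share settings are made up for illustration):

    # mount the gluster volume via fuse on one of the servers
    mount -t glusterfs 10.0.1.6:/export /mnt/export

    # smb.conf share pointing at the fuse mount instead of using vfs_glusterfs
    [export]
        path = /mnt/export
        read only = no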

Upon initial boot, the server where Samba is exporting files climbs to ~10GB of RAM within a couple of hours of use. From then on, there is a constant, slow memory increase. In the past, with gluster 3.8.x, we had to reboot the servers at around 30 days. With gluster 4.1.6 we are getting up to 48 days, but RAM use is at 48GB out of 64GB. Is this normal?

The particular versions are below:

[root@ysmha01 home]# uptime
16:59:39 up 48 days,  9:56,  1 user,  load average: 3.75, 3.17, 3.00
[root@ysmha01 home]# rpm -qa | grep gluster
centos-release-gluster41-1.0-3.el7.centos.noarch
glusterfs-server-4.1.6-1.el7.x86_64
glusterfs-api-4.1.6-1.el7.x86_64
centos-release-gluster-legacy-4.0-2.el7.centos.noarch
glusterfs-4.1.6-1.el7.x86_64
glusterfs-client-xlators-4.1.6-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.8.x86_64
glusterfs-fuse-4.1.6-1.el7.x86_64
glusterfs-libs-4.1.6-1.el7.x86_64
glusterfs-rdma-4.1.6-1.el7.x86_64
glusterfs-cli-4.1.6-1.el7.x86_64
samba-vfs-glusterfs-4.8.3-4.el7.x86_64
[root@ysmha01 home]# rpm -qa | grep samba
samba-common-tools-4.8.3-4.el7.x86_64
samba-client-libs-4.8.3-4.el7.x86_64
samba-libs-4.8.3-4.el7.x86_64
samba-4.8.3-4.el7.x86_64
samba-common-libs-4.8.3-4.el7.x86_64
samba-common-4.8.3-4.el7.noarch
samba-vfs-glusterfs-4.8.3-4.el7.x86_64
[root@ysmha01 home]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)

RAM view using top
Tasks: 398 total,   1 running, 397 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.0 us,  9.3 sy,  1.7 ni, 71.6 id,  9.7 wa,  0.0 hi,  0.8 si,  0.0 st
KiB Mem : 65772000 total,  1851344 free, 60487404 used,  3433252 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  3134316 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 9953 root      20   0 3727912 946496   3196 S 150.2  1.4  38626:27 glusterfsd
 9634 root      20   0   48.1g  47.2g   3184 S  96.3 75.3  29513:55 glusterfs
14485 root      20   0 3404140  63780   2052 S  80.7  0.1   1590:13 glusterfs
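One crude way to confirm the slow growth between reboots is to log the fuse client's resident memory over time (a sketch; 9634 is the glusterfs PID from the top output above):

    # append a timestamped RSS sample (in KiB) for the fuse client once an hour
    while true; do
        echo "$(date) $(ps -o rss= -p 9634)"
        sleep 3600
    done >> /var/tmp/glusterfs-rss.log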

[root@ysmha01 ~]# gluster v status export
Status of volume: export
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
Self-heal Daemon on localhost               N/A       N/A        Y       14485
Self-heal Daemon on 10.0.1.7                N/A       N/A        Y       21934
Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598

Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks




_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
