Hi,
I have been struggling with this OOM issue and so far nothing has helped.
We are running a 10TB archive volume that stores a bit more than 7 million files.
The problem is that, due to the way we manage this archive, we are forced to run a daily "full scan" of the file system to discover new uncompressed files. I know, I know, this is not an optimal solution, but it is what it is right now. This scan causes glusterfsd to grow its memory usage until the kernel's OOM killer terminates the process.
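For context, the scan is roughly equivalent to the sketch below. The mount point and the compressed-suffix check are made up for illustration; the real job differs in details.

#!/usr/bin/env python3
# Rough equivalent of the daily "full scan" (illustration only).
import os

MOUNT = "/mnt/logarch"          # hypothetical FUSE mount point of the volume
COMPRESSED_SUFFIXES = (".gz",)  # assumed suffix for already-compressed files

uncompressed = []
for dirpath, dirnames, filenames in os.walk(MOUNT):
    for name in filenames:
        if not name.endswith(COMPRESSED_SUFFIXES):
            # Every readdir/lookup/stat here hits the bricks, so one pass
            # touches all ~7M inodes on the volume.
            uncompressed.append(os.path.join(dirpath, name))

print("found %d uncompressed files" % len(uncompressed))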
Each brick host has 20GB of RAM. Here is the volume configuration:
Volume Name: logarch
Type: Distributed-Replicate
Volume ID: f5c109a0-704b-411b-9a76-be16f5e936db
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: lfs-n1-vdc1:/data/logarch/brick1/brick
Brick2: lfs-n1-vdc2:/data/logarch/brick1/brick
Brick3: lfs-n1-vdc1:/data/logarch/brick2/brick
Brick4: lfs-n1-vdc2:/data/logarch/brick2/brick
Brick5: lfs-n1-vdc1:/data/logarch/brick3/brick
Brick6: lfs-n1-vdc2:/data/logarch/brick3/brick
Options Reconfigured:
cluster.halo-enabled: yes
auth.allow: 10.10.10.1, 10.10.10.2
network.inode-lru-limit: 1000000
performance.md-cache-timeout: 60
performance.cache-invalidation: on
performance.cache-samba-metadata: off
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.client-io-threads: on
cluster.shd-max-threads: 4
performance.parallel-readdir: off
transport.address-family: inet
nfs.disable: on
cluster.lookup-optimize: on
server.event-threads: 16
client.event-threads: 16
server.manage-gids: off
performance.nl-cache: on
performance.rda-cache-limit: 64MB
features.bitrot: on
features.scrub: Active
cluster.brick-multiplex: on
PS: All affected clients use FUSE-type mounts.
J.