Help analysing statedumps

Hi,

I have a 3x replicated cluster running GlusterFS 4.1.7 on Ubuntu 16.04.5; all three replicas are also clients, hosting a Node.js/Nginx web server.

The current configuration is as follows:

Volume Name: gvol1
Type: Replicate
Volume ID: XXXXXX
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vm000000:/srv/brick1/gvol1
Brick2: vm000001:/srv/brick1/gvol1
Brick3: vm000002:/srv/brick1/gvol1
Options Reconfigured:
cluster.self-heal-readdir-size: 2KB
cluster.self-heal-window-size: 2
cluster.background-self-heal-count: 20
network.ping-timeout: 5
disperse.eager-lock: off
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.rda-cache-limit: 128MB
performance.cache-refresh-timeout: 10
performance.nl-cache-timeout: 600
performance.nl-cache: on
cluster.nufa: on
performance.enable-least-priority: off
server.outstanding-rpc-limit: 128
performance.strict-o-direct: on
cluster.shd-max-threads: 12
client.event-threads: 4
cluster.lookup-optimize: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.cache-samba-metadata: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
features.utime: on
storage.ctime: on
server.event-threads: 4
performance.cache-size: 256MB
performance.read-ahead: on
cluster.readdir-optimize: on
cluster.strict-readdir: on
performance.io-thread-count: 8
server.allow-insecure: on
cluster.read-hash-mode: 0
cluster.lookup-unhashed: auto
cluster.choose-local: on
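
For completeness, the effective value of every volume option (defaults included, not just the reconfigured ones shown above) can be listed with the standard CLI:

gluster volume get gvol1 all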

I believe there's a memory leak somewhere: memory usage just keeps climbing until it hangs one or more nodes, sometimes taking the whole cluster down.
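
In case it helps, a rough way to confirm which gluster process is the one growing (brick glusterfsd, FUSE client glusterfs, or the self-heal daemon) is to log per-process RSS over time. A minimal Python sketch, assuming only Linux /proc; it is illustrative, not something from my setup:

import glob
import time

def gluster_rss():
    """Return {'name/pid': VmRSS} for every gluster* process."""
    out = {}
    for path in glob.glob('/proc/[0-9]*/status'):
        try:
            with open(path) as f:
                fields = dict(line.split(':', 1) for line in f)
        except (OSError, ValueError):
            continue  # process exited, or a line had no ':'
        name = fields.get('Name', '').strip()
        if name.startswith('gluster'):  # glusterd, glusterfs, glusterfsd
            out[name + '/' + path.split('/')[2]] = fields.get('VmRSS', '?').strip()
    return out

while True:  # Ctrl-C to stop
    print(time.strftime('%F %T'), gluster_rss())
    time.sleep(60)

Whichever process's VmRSS keeps climbing between samples is the one worth statedumping.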

I have taken two statedumps on one of the nodes: one while memory usage was too high, and another just after a reboot, with the app running and the volume fully healed.

https://pmcdigital.sharepoint.com/:u:/g/EYDsNqTf1UdEuE6B0ZNVPfIBf_I-AbaqHotB1lJOnxLlTg?e=boYP09 (high memory)

https://pmcdigital.sharepoint.com/:u:/g/EWZBsnET2xBHl6OxO52RCfIBvQ0uIDQ1GKJZ1GrnviyMhg?e=wI3yaY (after reboot)
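
For anyone wanting to reproduce the comparison: statedumps like these can be generated with "gluster volume statedump gvol1" (they land in /var/run/gluster by default), or by sending SIGUSR1 to a client glusterfs process. To narrow down a leak, the two dumps can be diffed on their memory-accounting sections: each "[... memusage]" section carries size and num_allocs counters, so ranking sections by growth between the baseline and the high-memory dump points at the suspect allocation site. A minimal Python sketch (argument handling is illustrative, not from my setup):

import re
import sys

SECTION = re.compile(r'^\[(.+ memusage)\]$')

def parse(path):
    """Collect the integer counters of every '... memusage' section."""
    stats, current = {}, None
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            match = SECTION.match(line)
            if match:
                current = match.group(1)
                stats[current] = {}
            elif current and '=' in line:
                key, _, value = line.partition('=')
                if value.isdigit():
                    stats[current][key] = int(value)
            elif not line:
                current = None  # a blank line closes the section
    return stats

def main(before_path, after_path, top=20):
    before, after = parse(before_path), parse(after_path)
    rows = []
    for name, cur in after.items():
        prev = before.get(name, {})
        rows.append((cur.get('size', 0) - prev.get('size', 0),
                     cur.get('num_allocs', 0) - prev.get('num_allocs', 0),
                     name))
    rows.sort(reverse=True)  # biggest byte growth first
    print('%12s %13s  %s' % ('delta bytes', 'delta allocs', 'section'))
    for d_size, d_allocs, name in rows[:top]:
        print('%12d %13d  %s' % (d_size, d_allocs, name))

if __name__ == '__main__':
    main(sys.argv[1], sys.argv[2])

Run with the after-reboot dump as the first argument and the high-memory dump as the second; the usage types that grew the most end up at the top.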

Any help would be greatly appreciated.

Kindest regards,

Pedro Maia Costa
Senior Developer, pmc.digital

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
