Hi all,
We have 500GB and 10TB replica-4 (1 x 4), XFS-backed Gluster volumes, and the 10TB one in particular is extremely slow for certain operations (and has been since Gluster 3.x, when we started). We're currently on 5.13.
The number of files isn't even what I'd consider that large - under 100k per directory.
Here are some numbers to look at:
On the Gluster volume, in a directory of 45k files:
The first time:
time find | wc -l
45423

real    8m44.819s
user    0m0.459s
sys     0m0.998s
And again:
time find | wc -l
45423

real    0m34.677s
user    0m0.291s
sys     0m0.754s
If I run the same operation directly on the brick's underlying XFS filesystem:
The first time:
time find | wc -l
45423

real    0m13.514s
user    0m0.144s
sys     0m0.501s
And again:
time find | wc -l
45423

real    0m0.197s
user    0m0.088s
sys     0m0.106s
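For clarity, the "first time" numbers above are cold-cache runs. A rough sketch of how to reproduce the cold/warm pair (assuming root on the machine; writing 3 to drop_caches is the standard way to empty the kernel's page/dentry/inode caches, shown here as an illustration rather than our exact procedure, and the directory path is hypothetical):

cd /path/to/the/45k-file/dir         # hypothetical path
sync
echo 3 > /proc/sys/vm/drop_caches    # flush page cache, dentries and inodes (forces a cold run)
time find | wc -l                    # first (cold) run
time find | wc -l                    # second (warm) run

Note this only clears the kernel caches on the box where it's run; for the brick-side test the same would need doing on the server.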
I'd expect some performance difference here, but just as it was several years ago when we started with Gluster, the gap is still huge, and simple file listings are incredibly slow.
At the time, the Gluster team was looking into some optimizations for this, but I'm not sure whether those ever happened.
What can we do to try to improve performance?
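If profile data would help, I'm happy to capture it around one of the slow listings, along the lines of the standard profiling sequence (a sketch, using the volume name shown below):

gluster volume profile SNIP_data1 start
time find | wc -l                    # the slow listing, run on a client mount
gluster volume profile SNIP_data1 info
gluster volume profile SNIP_data1 stop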
Thank you.
Some setup values follow.
xfs_info /mnt/SNIP_block1
meta-data="" isize=512 agcount=103, agsize=26214400 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=0, rmapbt=0
= reflink=0
data = bsize=4096 blocks=2684354560, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=51200, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data="" isize=512 agcount=103, agsize=26214400 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=0, rmapbt=0
= reflink=0
data = bsize=4096 blocks=2684354560, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=51200, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Volume Name: SNIP_data1
Type: Replicate
Volume ID: SNIP
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
Brick2: forge:/mnt/SNIP_block1/SNIP_data1
Brick3: hive:/mnt/SNIP_block1/SNIP_data1
Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
network.ping-timeout: 5
network.remote-dio: enable
performance.rda-cache-limit: 256MB
performance.readdir-ahead: on
performance.parallel-readdir: on
network.inode-lru-limit: 500000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.readdir-optimize: on
performance.io-thread-count: 32
server.event-threads: 4
client.event-threads: 4
performance.read-ahead: off
cluster.lookup-optimize: on
performance.cache-size: 1GB
cluster.self-heal-daemon: enable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
cluster.granular-entry-heal: enable
cluster.data-self-heal-algorithm: full
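In case any of these were reset at some point, the effective value of each option can be double-checked with gluster volume get, e.g. (a sketch; option names as listed above):

gluster volume get SNIP_data1 performance.parallel-readdir
gluster volume get SNIP_data1 all    # dump the full effective option table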