On 11/4/19 8:46 PM, David Spisla wrote:
Dear Gluster Community,
I also have an issue concerning performance. Over the last
few days I updated our test cluster from GlusterFS v5.5 to
v7.0. The setup in general:
2 HP DL380 servers with 10Gbit NICs, 1
Distribute-Replica 2 volume with 2 replica pairs. The client
is Samba (access via vfs_glusterfs). I did several
tests to ensure that Samba does not cause the drop.
The setup is completely the same except for the Gluster
version.
Here are my results (values in MiB/s):

File size          64KiB     1MiB     10MiB
GlusterFS v5.5      3.49    47.41    300.50
GlusterFS v7.0      0.16     2.61     76.63
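For reference, a single data point of this kind can be reproduced with a
plain sequential write over the SMB mount (a sketch only; /mnt/archive1 is
an assumed client-side mount point, not from our actual setup):

# write one 10MiB file through the SMB mount and let dd report throughput;
# conv=fsync makes dd flush before reporting, so the number reflects real writes
dd if=/dev/zero of=/mnt/archive1/testfile bs=1M count=10 conv=fsync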
Can you please share the profile information [1] for both
versions? It would also be really helpful if you could mention the
I/O patterns used for these tests.
[1] :
https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
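In case it helps, profiling is toggled per volume; a minimal sketch using
your volume name:

# start collecting per-brick fop statistics
gluster volume profile archive1 start
# ... run the workload ...
# print cumulative and interval statistics
gluster volume profile archive1 info
# stop profiling again
gluster volume profile archive1 stop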
We use these volume options (GlusterFS 7.0):
Volume Name: archive1
Type: Distributed-Replicate
Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: on
storage.build-pgfid: on
features.ctime: on
cluster.quorum-type: fixed
cluster.quorum-count: 1
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
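For comparison between the two versions, the full effective option set
(defaults included) can be dumped with `gluster volume get`; a minimal
sketch:

# dump every volume option, set or default, with its effective value
gluster volume get archive1 all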
For GlusterFS 5.5 it is nearly the same, except for the fact
that there were 2 options to enable the ctime feature.
Ctime stores additional metadata information as extended
attributes, which sometimes exceeds the default inode size. In such
scenarios the additional xattrs won't fit into the inode, and
extra blocks are used to store the xattrs outside the inode,
which affects latency. This depends purely on the I/O operations
and the total xattr size stored in the inode.
Is it possible for you to repeat the test by disabling ctime or
increasing the inode size to a higher value, say 1024 bytes?
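Concretely, both suggestions could look like this (a sketch; /dev/sdX is a
placeholder for a brick device, and reformatting wipes the brick, so only
do this on a test setup):

# disable the ctime feature on the volume
gluster volume set archive1 features.ctime off

# or recreate a brick filesystem with 1024-byte inodes
# (WARNING: destroys all data on the device; /dev/sdX is a placeholder)
mkfs.xfs -f -i size=1024 /dev/sdX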
Our Samba optimizations look like this (identical for every
version):
[global]
workgroup = SAMBA
netbios name = CLUSTER
kernel share modes = no
aio read size = 1
aio write size = 1
kernel oplocks = no
max open files = 100000
nt acl support = no
security = user
server min protocol = SMB2
store dos attributes = no
strict locking = no
full_audit:failure = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
full_audit:success = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
full_audit:facility = local5
durable handles = yes
posix locking = no
log level = 2
max log size = 100000
debug pid = yes
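The per-share section is not shown above; for context, a minimal sketch of
how a share ties into vfs_glusterfs might look like this (share name and
log path are illustrative, not from our actual config):

[archive1]
    path = /
    read only = no
    vfs objects = glusterfs
    glusterfs:volume = archive1
    glusterfs:logfile = /var/log/samba/glusterfs-archive1.log
    glusterfs:loglevel = 5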
What can be the cause of this rapid drop in performance
for small files? Are some of our volume options
no longer recommended?
There were some patches concerning small-file performance
in v6.0 and v7.0:

#1670031: performance regression seen with smallfile workload tests
#1659327: 43% regression in small-file sequential read performance

And one patch for the io-cache:

#1659869: improvements to io-cache
Regards
David Spisla
________
Community Meeting Calendar:
APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314
NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users