Hi All,
We have set up a new Gluster volume, with servers on version 9.3 and clients on 9.0. In this setup, clients are randomly disconnected, and there is nothing relevant in the client, server, or brick logs around the time of the disconnects. One difference we have noticed between this setup and our other production setups is that this one sees many small reads and writes.
We are wondering if anyone can help with the performance tuning parameters that we can set or unset to optimize our setup.
Any suggestions are welcome and appreciated.
Adding the current server parameters:
Option Value
------ -----
cluster.lookup-unhashed on (DEFAULT)
cluster.lookup-optimize on (DEFAULT)
cluster.min-free-disk 10% (DEFAULT)
cluster.min-free-inodes 5% (DEFAULT)
cluster.rebalance-stats off (DEFAULT)
cluster.subvols-per-directory (null) (DEFAULT)
cluster.readdir-optimize off (DEFAULT)
cluster.rsync-hash-regex (null) (DEFAULT)
cluster.extra-hash-regex (null) (DEFAULT)
cluster.dht-xattr-name trusted.glusterfs.dht (DEFAULT)
cluster.randomize-hash-range-by-gfid off (DEFAULT)
cluster.rebal-throttle normal (DEFAULT)
cluster.lock-migration off
cluster.force-migration off
cluster.local-volume-name (null) (DEFAULT)
cluster.weighted-rebalance on (DEFAULT)
cluster.switch-pattern (null) (DEFAULT)
cluster.entry-change-log on (DEFAULT)
cluster.read-subvolume (null) (DEFAULT)
cluster.read-subvolume-index -1 (DEFAULT)
cluster.read-hash-mode 1 (DEFAULT)
cluster.background-self-heal-count 8 (DEFAULT)
cluster.metadata-self-heal off (DEFAULT)
cluster.data-self-heal off (DEFAULT)
cluster.entry-self-heal off (DEFAULT)
cluster.self-heal-daemon on (DEFAULT)
cluster.heal-timeout 600 (DEFAULT)
cluster.self-heal-window-size 8 (DEFAULT)
cluster.data-change-log on (DEFAULT)
cluster.metadata-change-log on (DEFAULT)
cluster.data-self-heal-algorithm (null) (DEFAULT)
cluster.eager-lock on (DEFAULT)
disperse.eager-lock on (DEFAULT)
disperse.other-eager-lock on (DEFAULT)
disperse.eager-lock-timeout 1 (DEFAULT)
disperse.other-eager-lock-timeout 1 (DEFAULT)
cluster.quorum-type none (DEFAULT)
cluster.quorum-count (null) (DEFAULT)
cluster.choose-local true (DEFAULT)
cluster.self-heal-readdir-size 1KB (DEFAULT)
cluster.post-op-delay-secs 1 (DEFAULT)
cluster.ensure-durability on (DEFAULT)
cluster.consistent-metadata no (DEFAULT)
cluster.heal-wait-queue-length 128 (DEFAULT)
cluster.favorite-child-policy none (DEFAULT)
cluster.full-lock yes (DEFAULT)
cluster.optimistic-change-log on (DEFAULT)
diagnostics.latency-measurement off
diagnostics.dump-fd-stats off (DEFAULT)
diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO
diagnostics.client-log-level INFO
diagnostics.brick-sys-log-level CRITICAL (DEFAULT)
diagnostics.client-sys-log-level CRITICAL (DEFAULT)
diagnostics.brick-logger (null) (DEFAULT)
diagnostics.client-logger (null) (DEFAULT)
diagnostics.brick-log-format (null) (DEFAULT)
diagnostics.client-log-format (null) (DEFAULT)
diagnostics.brick-log-buf-size 5 (DEFAULT)
diagnostics.client-log-buf-size 5 (DEFAULT)
diagnostics.brick-log-flush-timeout 120 (DEFAULT)
diagnostics.client-log-flush-timeout 120 (DEFAULT)
diagnostics.stats-dump-interval 0 (DEFAULT)
diagnostics.fop-sample-interval 0 (DEFAULT)
diagnostics.stats-dump-format json (DEFAULT)
diagnostics.fop-sample-buf-size 65535 (DEFAULT)
diagnostics.stats-dnscache-ttl-sec 86400 (DEFAULT)
performance.cache-max-file-size 0 (DEFAULT)
performance.cache-min-file-size 0 (DEFAULT)
performance.cache-refresh-timeout 1 (DEFAULT)
performance.cache-priority (DEFAULT)
performance.io-cache-size 32MB (DEFAULT)
performance.cache-size 256MB
performance.io-thread-count 16 (DEFAULT)
performance.high-prio-threads 16 (DEFAULT)
performance.normal-prio-threads 16 (DEFAULT)
performance.low-prio-threads 16 (DEFAULT)
performance.least-prio-threads 1 (DEFAULT)
performance.enable-least-priority on (DEFAULT)
performance.iot-watchdog-secs (null) (DEFAULT)
performance.iot-cleanup-disconnected-reqs off (DEFAULT)
performance.iot-pass-through false (DEFAULT)
performance.io-cache-pass-through false (DEFAULT)
performance.quick-read-cache-size 128MB (DEFAULT)
performance.cache-size 256MB
performance.quick-read-cache-timeout 1 (DEFAULT)
performance.qr-cache-timeout 600
performance.quick-read-cache-invalidation false (DEFAULT)
performance.ctime-invalidation false (DEFAULT)
performance.flush-behind on (DEFAULT)
performance.nfs.flush-behind on (DEFAULT)
performance.write-behind-window-size 1MB (DEFAULT)
performance.resync-failed-syncs-after-fsync off (DEFAULT)
performance.nfs.write-behind-window-size 1MB (DEFAULT)
performance.strict-o-direct off (DEFAULT)
performance.nfs.strict-o-direct off (DEFAULT)
performance.strict-write-ordering off (DEFAULT)
performance.nfs.strict-write-ordering off (DEFAULT)
performance.write-behind-trickling-writes on (DEFAULT)
performance.aggregate-size 128KB (DEFAULT)
performance.nfs.write-behind-trickling-writes on (DEFAULT)
performance.lazy-open yes (DEFAULT)
performance.read-after-open yes (DEFAULT)
performance.open-behind-pass-through false (DEFAULT)
performance.read-ahead-page-count 4 (DEFAULT)
performance.read-ahead-pass-through false (DEFAULT)
performance.readdir-ahead-pass-through false (DEFAULT)
performance.md-cache-pass-through false (DEFAULT)
performance.write-behind-pass-through false (DEFAULT)
performance.md-cache-timeout 1 (DEFAULT)
performance.cache-swift-metadata false (DEFAULT)
performance.cache-samba-metadata false (DEFAULT)
performance.cache-capability-xattrs true (DEFAULT)
performance.cache-ima-xattrs true (DEFAULT)
performance.md-cache-statfs off (DEFAULT)
performance.xattr-cache-list (DEFAULT)
performance.nl-cache-pass-through false (DEFAULT)
network.frame-timeout 1800 (DEFAULT)
network.ping-timeout 42 (DEFAULT)
network.tcp-window-size (null) (DEFAULT)
client.ssl off
network.remote-dio disable (DEFAULT)
client.event-threads 12
client.tcp-user-timeout 0
client.keepalive-time 20000
client.keepalive-interval 2000
client.keepalive-count 9
client.strict-locks off
network.tcp-window-size (null) (DEFAULT)
network.inode-lru-limit 50000
auth.allow *
auth.reject (null) (DEFAULT)
transport.keepalive 1
server.allow-insecure on (DEFAULT)
server.root-squash off (DEFAULT)
server.all-squash off (DEFAULT)
server.anonuid 65534 (DEFAULT)
server.anongid 65534 (DEFAULT)
server.statedump-path /var/run/gluster (DEFAULT)
server.outstanding-rpc-limit 128
server.ssl off
auth.ssl-allow *
server.manage-gids off (DEFAULT)
server.dynamic-auth on (DEFAULT)
client.send-gids on (DEFAULT)
server.gid-timeout 300 (DEFAULT)
server.own-thread (null) (DEFAULT)
server.event-threads 12
server.tcp-user-timeout 42 (DEFAULT)
server.keepalive-time 20
server.keepalive-interval 2
server.keepalive-count 9
transport.listen-backlog 1024
ssl.own-cert (null) (DEFAULT)
ssl.private-key (null) (DEFAULT)
ssl.ca-list (null) (DEFAULT)
ssl.crl-path (null) (DEFAULT)
ssl.certificate-depth (null) (DEFAULT)
ssl.cipher-list (null) (DEFAULT)
ssl.dh-param (null) (DEFAULT)
ssl.ec-curve (null) (DEFAULT)
transport.address-family inet
performance.write-behind on
performance.read-ahead off
performance.readdir-ahead off
performance.io-cache off
performance.open-behind on
performance.quick-read on
performance.nl-cache on
performance.stat-prefetch on
performance.client-io-threads on
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true (DEFAULT)
performance.cache-invalidation on
performance.global-cache-invalidation true (DEFAULT)
features.uss off
features.snapshot-directory .snaps
features.show-snapshot-directory off
features.tag-namespaces off
network.compression off
network.compression.window-size -15 (DEFAULT)
network.compression.mem-level 8 (DEFAULT)
network.compression.min-size 0 (DEFAULT)
network.compression.compression-level -1 (DEFAULT)
network.compression.debug false (DEFAULT)
features.default-soft-limit 80% (DEFAULT)
features.soft-timeout 60 (DEFAULT)
features.hard-timeout 5 (DEFAULT)
features.alert-time 86400 (DEFAULT)
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota off
features.inode-quota off
features.bitrot disable
debug.trace off
debug.log-history no (DEFAULT)
debug.log-file no (DEFAULT)
debug.exclude-ops (null) (DEFAULT)
debug.include-ops (null) (DEFAULT)
debug.error-gen off
debug.error-failure (null) (DEFAULT)
debug.error-number (null) (DEFAULT)
debug.random-failure off (DEFAULT)
debug.error-fops (null) (DEFAULT)
nfs.disable on
features.read-only off (DEFAULT)
features.worm off
features.worm-file-level off
features.worm-files-deletable on
features.default-retention-period 120 (DEFAULT)
features.retention-mode relax (DEFAULT)
features.auto-commit-period 180 (DEFAULT)
storage.linux-aio off (DEFAULT)
storage.linux-io_uring off (DEFAULT)
storage.batch-fsync-mode reverse-fsync (DEFAULT)
storage.batch-fsync-delay-usec 0 (DEFAULT)
storage.owner-uid -1 (DEFAULT)
storage.owner-gid -1 (DEFAULT)
storage.node-uuid-pathinfo off (DEFAULT)
storage.health-check-interval 30 (DEFAULT)
storage.build-pgfid off (DEFAULT)
storage.gfid2path on (DEFAULT)
storage.gfid2path-separator : (DEFAULT)
storage.reserve 1 (DEFAULT)
storage.health-check-timeout 20 (DEFAULT)
storage.fips-mode-rchecksum off
storage.force-create-mode 0000 (DEFAULT)
storage.force-directory-mode 0000 (DEFAULT)
storage.create-mask 0777 (DEFAULT)
storage.create-directory-mask 0777 (DEFAULT)
storage.max-hardlinks 100 (DEFAULT)
features.ctime on (DEFAULT)
config.gfproxyd off
cluster.server-quorum-type off
cluster.server-quorum-ratio 51
changelog.changelog off (DEFAULT)
changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs (DEFAULT)
changelog.encoding ascii (DEFAULT)
changelog.rollover-time 15 (DEFAULT)
changelog.fsync-interval 5 (DEFAULT)
changelog.changelog-barrier-timeout 120
changelog.capture-del-path off (DEFAULT)
features.barrier disable
features.barrier-timeout 120
features.trash off (DEFAULT)
features.trash-dir .trashcan (DEFAULT)
features.trash-eliminate-path (null) (DEFAULT)
features.trash-max-filesize 5MB (DEFAULT)
features.trash-internal-op off (DEFAULT)
cluster.enable-shared-storage disable
locks.trace off (DEFAULT)
locks.mandatory-locking off (DEFAULT)
cluster.disperse-self-heal-daemon enable (DEFAULT)
cluster.quorum-reads no (DEFAULT)
client.bind-insecure (null) (DEFAULT)
features.shard off
features.shard-block-size 64MB (DEFAULT)
features.shard-lru-limit 16384 (DEFAULT)
features.shard-deletion-rate 100 (DEFAULT)
features.scrub-throttle lazy
features.scrub-freq biweekly
features.scrub false (DEFAULT)
features.expiry-time 120
features.signer-threads 4
features.cache-invalidation on
features.cache-invalidation-timeout 600
ganesha.enable off
features.leases off
features.lease-lock-recall-timeout 60 (DEFAULT)
disperse.background-heals 8 (DEFAULT)
disperse.heal-wait-qlength 128 (DEFAULT)
cluster.heal-timeout 600 (DEFAULT)
dht.force-readdirp on (DEFAULT)
disperse.read-policy gfid-hash (DEFAULT)
cluster.shd-max-threads 1 (DEFAULT)
cluster.shd-wait-qlength 1024 (DEFAULT)
cluster.locking-scheme full (DEFAULT)
cluster.granular-entry-heal no (DEFAULT)
features.locks-revocation-secs 0 (DEFAULT)
features.locks-revocation-clear-all false (DEFAULT)
features.locks-revocation-max-blocked 0 (DEFAULT)
features.locks-monkey-unlocking false (DEFAULT)
features.locks-notify-contention yes (DEFAULT)
features.locks-notify-contention-delay 5 (DEFAULT)
disperse.shd-max-threads 1 (DEFAULT)
disperse.shd-wait-qlength 1024 (DEFAULT)
disperse.cpu-extensions auto (DEFAULT)
disperse.self-heal-window-size 32 (DEFAULT)
cluster.use-compound-fops off
performance.parallel-readdir on
performance.rda-request-size 131072
performance.rda-low-wmark 4096 (DEFAULT)
performance.rda-high-wmark 128KB (DEFAULT)
performance.rda-cache-limit 10MB
performance.nl-cache-positive-entry false (DEFAULT)
performance.nl-cache-limit 10MB
performance.nl-cache-timeout 600
cluster.brick-multiplex disable
glusterd.vol_count_per_thread 100
cluster.max-bricks-per-process 250
disperse.optimistic-change-log on (DEFAULT)
disperse.stripe-cache 4 (DEFAULT)
cluster.halo-enabled False (DEFAULT)
cluster.halo-shd-max-latency 99999 (DEFAULT)
cluster.halo-nfsd-max-latency 5 (DEFAULT)
cluster.halo-max-latency 5 (DEFAULT)
cluster.halo-max-replicas 99999 (DEFAULT)
cluster.halo-min-replicas 2 (DEFAULT)
features.selinux on
cluster.daemon-log-level INFO
debug.delay-gen off
delay-gen.delay-percentage 10% (DEFAULT)
delay-gen.delay-duration 100000 (DEFAULT)
delay-gen.enable (DEFAULT)
disperse.parallel-writes on (DEFAULT)
disperse.quorum-count 0 (DEFAULT)
features.sdfs off
features.cloudsync off
features.ctime on
ctime.noatime on
features.cloudsync-storetype (null) (DEFAULT)
features.enforce-mandatory-lock off
config.global-threading off
config.client-threads 16
config.brick-threads 16
features.cloudsync-remote-read off
features.cloudsync-store-id (null) (DEFAULT)
features.cloudsync-product-id (null) (DEFAULT)
features.acl enable
cluster.use-anonymous-inode yes
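
Since most of the rows above are defaults, the handful of values we actually changed may be easier to review in isolation. Below is a minimal sketch for pulling them out, assuming the listing is saved as plain text in the same "option value (DEFAULT)" layout shown above. The function name non_default_options and the short sample are illustrative only and are not part of any Gluster tooling.

```python
def non_default_options(listing):
    """Return {option: value} for every row not marked (DEFAULT).

    Assumes one option per line, name and value separated by whitespace,
    with defaults flagged by a trailing "(DEFAULT)" as in the dump above.
    """
    result = {}
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines, the "Option Value" header and the dashed rule.
        if not line or line.startswith("Option ") or line.startswith("------"):
            continue
        # Rows ending in (DEFAULT) were never changed on this volume.
        if line.endswith("(DEFAULT)"):
            continue
        name, _, value = line.partition(" ")
        # If an option is listed twice, the later row wins.
        result[name] = value.strip()
    return result


if __name__ == "__main__":
    # A few rows copied from the listing above, used as sample input.
    sample = """\
Option Value
------ -----
network.ping-timeout 42 (DEFAULT)
client.event-threads 12
client.keepalive-time 20000
server.keepalive-time 20
performance.cache-size 256MB
"""
    for name, value in non_default_options(sample).items():
        print(f"{name} = {value}")
```

Run against the full listing, this should leave only the explicitly set options, such as the event-thread and keepalive settings.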
Regards,
Shreyansh Shah
AlphaGrep Securities Pvt. Ltd.