Dear all,
we would like to describe a situation that we have had for a long time and that has not resolved itself, even after many minor and major upgrades of GlusterFS.
Our GlusterFS servers run as VMs in a KVM environment, and the host servers are updated regularly. The hosts are heterogeneous hardware, but configured with the same characteristics. The VMs have also been harmonized to use virtio drivers for devices where available, and the resources reserved for them are the same on each host.
The physical switch for the hosts has been replaced with a reliable one.
Probing peers has been, and still is, quite quick on the heartbeat network, and communication between the servers apparently has no issues or disruptions. I say "apparently" because what we actually see is:
- always some pending failed heals, which we used to resolve with a rolling reboot of the Gluster VMs (replica 3). Restarting only the GlusterFS-related services (daemon, events, etc.) has no effect; only a reboot brings results
- very often the failed heals concern directories
We recently removed a brick that was on a VM on a host that had been entirely replaced. We re-added the brick, the sync ran, all data was eventually synced, and the brick started with 0 pending failed heals. Now it develops failed heals too, just like its fellow bricks. Please take into account that we had healed all the failed entries (manually, with various methods) before adding the third brick. After some days of operation, the count of failed heals rises again; not really fast, but definitely with new entries (which may or may not resolve with rolling reboots).
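To track whether the failed-heal count is really growing over time, we poll the heal info periodically. A minimal sketch of how this can be tallied, assuming the `--xml` output layout of `gluster volume heal <vol> info --xml` (the XML field names, host names and brick paths in the embedded sample are assumptions for illustration, not taken from our cluster):

```python
# Sketch: tally pending heal entries per brick from the XML output of
# `gluster volume heal <vol> info --xml`, so growth can be logged over
# time. The XML layout assumed here should be checked against your own
# GlusterFS version's output.
import xml.etree.ElementTree as ET


def heal_counts(xml_text: str) -> dict:
    """Return {brick-name: numberOfEntries} parsed from heal-info XML."""
    root = ET.fromstring(xml_text)
    counts = {}
    for brick in root.iter("brick"):
        name = brick.findtext("name")
        entries = brick.findtext("numberOfEntries")
        if name is not None and entries is not None:
            counts[name] = int(entries)
    return counts


# Hypothetical sample of the XML structure:
SAMPLE = """<cliOutput>
  <healInfo><bricks>
    <brick hostUuid="aaaa"><name>srv1:/bricks/gv-ho/brick</name>
      <status>Connected</status><numberOfEntries>0</numberOfEntries></brick>
    <brick hostUuid="bbbb"><name>srv2:/bricks/gv-ho/brick</name>
      <status>Connected</status><numberOfEntries>3</numberOfEntries></brick>
  </bricks></healInfo>
</cliOutput>"""

if __name__ == "__main__":
    print(heal_counts(SAMPLE))
```

Run against a cron-captured XML dump, this gives a per-brick history of the pending count.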
We also have Gluster clients on our CTDB nodes, which connect to the volume and mount it via the GlusterFS client. Windows roaming profiles shared via SMB frequently become corrupted (they are composed of a great number of small files, adding up to a large total size). The Gluster bricks are formatted with XFS.
What we also observe is that mounting with the vfs option in SMB on the CTDB nodes involves some kind of delay: you can see the shared folder, for example from a Windows client machine, via one CTDB node but not via another node in the cluster, and then after a while it appears there too. This happens frequently.
This is an excerpt of entries from our shd (self-heal daemon) logs (the error strings are in German: "Veraltete Dateizugriffsnummer" is "Stale file handle", and "Die Dateizugriffsnummer ist in schlechter Verfassung" is "File descriptor in bad state"):
[2024-04-08 10:13:26.213596 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 0-gv-ho-replicate-0: performing full entry selfheal on 2c621415-6223-4b66-a4ca-3f6f267a448d
[2024-04-08 10:14:08.135911 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote operation failed. [{source=<gfid:91d83f0e-1864-4ff3-9174-b7c956e20596>}, {target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer (file handle)}]
[2024-04-08 10:15:59.135908 +0000] W [MSGID: 114061] [client-common.c:2992:client_pre_readdir_v2] 0-gv-ho-client-5: remote_fd is -1. EBADFD [{gfid=6b5e599e-c836-4ebe-b16a-8224425b88c7}, {errno=77}, {error=Die Dateizugriffsnummer ist in schlechter Verfassung}]
[2024-04-08 10:30:25.013592 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 0-gv-ho-replicate-0: performing full entry selfheal on 24e82e12-5512-4679-9eb3-8bd098367db7
[2024-04-08 10:33:17.613594 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote operation failed. [{source=<gfid:ef9068fc-a329-4a21-88d2-265ecd3d208c>}, {target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer (file handle)}]
[2024-04-08 10:33:21.201359 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote operation failed. [{source=
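When the shd log only names a GFID like the ones above, the corresponding backend entry on a brick can be computed from the GFID itself, since Gluster bricks store every entry under .glusterfs/<xx>/<yy>/<gfid>. A small sketch (the brick path below is a made-up example):

```python
# Sketch: compute the on-brick backend path for a GFID from an shd log.
# Gluster bricks keep a .glusterfs/<first-2-hex>/<next-2-hex>/<full-gfid>
# entry: a hardlink to the real file for regular files, a symlink for
# directories. The brick root used here is a hypothetical example.
import os


def gfid_backend_path(brick_root: str, gfid: str) -> str:
    g = gfid.strip().lower()
    return os.path.join(brick_root, ".glusterfs", g[:2], g[2:4], g)


path = gfid_backend_path("/bricks/gv-ho/brick",
                         "2c621415-6223-4b66-a4ca-3f6f267a448d")
# For a regular file, the human-readable name can then be found by inode:
#   find <brick_root> -samefile <path> -not -path '*/.glusterfs/*'
print(path)
```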
How are the clients (e.g. 0-gv-ho-client-5) mapped to real hosts, so that we know which host's logs to look at?
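Our current understanding (please correct us if wrong) is that client-N is simply the N-th brick, zero-based, in the `gluster volume info` brick list, and that the mapping can also be read from the FUSE client volfile, where each protocol/client block carries remote-host and remote-subvolume options. A quick parse sketch (the volfile sample and host names are invented for illustration):

```python
# Sketch: map <vol>-client-N translator names (the "0-" prefix in log
# lines is just the graph id) to host:brick by parsing a FUSE client
# volfile, e.g. under /var/lib/glusterd/vols/<vol>/. The sample volfile
# text below is invented; field names follow the usual volfile syntax.
def client_map(volfile_text: str) -> dict:
    mapping = {}
    current = host = brick = None
    for raw in volfile_text.splitlines():
        line = raw.strip()
        if line.startswith("volume "):
            current = line.split()[1]
            host = brick = None
        elif line.startswith("option remote-host "):
            host = line.split()[-1]
        elif line.startswith("option remote-subvolume "):
            brick = line.split()[-1]
        elif line == "end-volume" and current and "-client-" in current and host:
            mapping[current] = f"{host}:{brick}"
            current = None
    return mapping


SAMPLE = """volume gv-ho-client-0
    type protocol/client
    option remote-host srv1.example
    option remote-subvolume /bricks/gv-ho/brick
end-volume
volume gv-ho-client-5
    type protocol/client
    option remote-host srv3.example
    option remote-subvolume /bricks/gv-ho/brick
end-volume
"""

if __name__ == "__main__":
    print(client_map(SAMPLE))
```

With that mapping, a "0-gv-ho-client-5" warning points at one specific brick host's logs.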
We would like to proceed by exclusion to finally eradicate this, ideally in a conservative way (without rebuilding everything), but we are becoming clueless as to where to look, as we have also tried various option settings regarding performance etc.
Here is the option set on our main volume:
cluster.lookup-unhashed on (DEFAULT)
cluster.lookup-optimize on (DEFAULT)
cluster.min-free-disk 10% (DEFAULT)
cluster.min-free-inodes 5% (DEFAULT)
cluster.rebalance-stats off (DEFAULT)
cluster.subvols-per-directory (null) (DEFAULT)
cluster.readdir-optimize off (DEFAULT)
cluster.rsync-hash-regex (null) (DEFAULT)
cluster.extra-hash-regex (null) (DEFAULT)
cluster.dht-xattr-name trusted.glusterfs.dht (DEFAULT)
cluster.randomize-hash-range-by-gfid off (DEFAULT)
cluster.rebal-throttle normal (DEFAULT)
cluster.lock-migration off
cluster.force-migration off
cluster.local-volume-name (null) (DEFAULT)
cluster.weighted-rebalance on (DEFAULT)
cluster.switch-pattern (null) (DEFAULT)
cluster.entry-change-log on (DEFAULT)
cluster.read-subvolume (null) (DEFAULT)
cluster.read-subvolume-index -1 (DEFAULT)
cluster.read-hash-mode 1 (DEFAULT)
cluster.background-self-heal-count 8 (DEFAULT)
cluster.metadata-self-heal on
cluster.data-self-heal on
cluster.entry-self-heal on
cluster.self-heal-daemon enable
cluster.heal-timeout 600 (DEFAULT)
cluster.self-heal-window-size 8 (DEFAULT)
cluster.data-change-log on (DEFAULT)
cluster.metadata-change-log on (DEFAULT)
cluster.data-self-heal-algorithm (null) (DEFAULT)
cluster.eager-lock on (DEFAULT)
disperse.eager-lock on (DEFAULT)
disperse.other-eager-lock on (DEFAULT)
disperse.eager-lock-timeout 1 (DEFAULT)
disperse.other-eager-lock-timeout 1 (DEFAULT)
cluster.quorum-type auto
cluster.quorum-count 2
cluster.choose-local true (DEFAULT)
cluster.self-heal-readdir-size 1KB (DEFAULT)
cluster.post-op-delay-secs 1 (DEFAULT)
cluster.ensure-durability on (DEFAULT)
cluster.consistent-metadata no (DEFAULT)
cluster.heal-wait-queue-length 128 (DEFAULT)
cluster.favorite-child-policy none
cluster.full-lock yes (DEFAULT)
cluster.optimistic-change-log on (DEFAULT)
diagnostics.latency-measurement off
diagnostics.dump-fd-stats off (DEFAULT)
diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO
diagnostics.client-log-level INFO
diagnostics.brick-sys-log-level CRITICAL (DEFAULT)
diagnostics.client-sys-log-level CRITICAL (DEFAULT)
diagnostics.brick-logger (null) (DEFAULT)
diagnostics.client-logger (null) (DEFAULT)
diagnostics.brick-log-format (null) (DEFAULT)
diagnostics.client-log-format (null) (DEFAULT)
diagnostics.brick-log-buf-size 5 (DEFAULT)
diagnostics.client-log-buf-size 5 (DEFAULT)
diagnostics.brick-log-flush-timeout 120 (DEFAULT)
diagnostics.client-log-flush-timeout 120 (DEFAULT)
diagnostics.stats-dump-interval 0 (DEFAULT)
diagnostics.fop-sample-interval 0 (DEFAULT)
diagnostics.stats-dump-format json (DEFAULT)
diagnostics.fop-sample-buf-size 65535 (DEFAULT)
diagnostics.stats-dnscache-ttl-sec 86400 (DEFAULT)
performance.cache-max-file-size 10
performance.cache-min-file-size 0 (DEFAULT)
performance.cache-refresh-timeout 1 (DEFAULT)
performance.cache-priority (DEFAULT)
performance.io-cache-size 32MB (DEFAULT)
performance.cache-size 32MB (DEFAULT)
performance.io-thread-count 16 (DEFAULT)
performance.high-prio-threads 16 (DEFAULT)
performance.normal-prio-threads 16 (DEFAULT)
performance.low-prio-threads 16 (DEFAULT)
performance.least-prio-threads 1 (DEFAULT)
performance.enable-least-priority on (DEFAULT)
performance.iot-watchdog-secs (null) (DEFAULT)
performance.iot-cleanup-disconnected-reqs off (DEFAULT)
performance.iot-pass-through false (DEFAULT)
performance.io-cache-pass-through false (DEFAULT)
performance.quick-read-cache-size 128MB (DEFAULT)
performance.cache-size 128MB (DEFAULT)
performance.quick-read-cache-timeout 1 (DEFAULT)
performance.qr-cache-timeout 600
performance.quick-read-cache-invalidation false (DEFAULT)
performance.ctime-invalidation false (DEFAULT)
performance.flush-behind on (DEFAULT)
performance.nfs.flush-behind on (DEFAULT)
performance.write-behind-window-size 4MB
performance.resync-failed-syncs-after-fsync off (DEFAULT)
performance.nfs.write-behind-window-size 1MB (DEFAULT)
performance.strict-o-direct off (DEFAULT)
performance.nfs.strict-o-direct off (DEFAULT)
performance.strict-write-ordering off (DEFAULT)
performance.nfs.strict-write-ordering off (DEFAULT)
performance.write-behind-trickling-writes on (DEFAULT)
performance.aggregate-size 128KB (DEFAULT)
performance.nfs.write-behind-trickling-writes on (DEFAULT)
performance.lazy-open yes (DEFAULT)
performance.read-after-open yes (DEFAULT)
performance.open-behind-pass-through false (DEFAULT)
performance.read-ahead-page-count 4 (DEFAULT)
performance.read-ahead-pass-through false (DEFAULT)
performance.readdir-ahead-pass-through false (DEFAULT)
performance.md-cache-pass-through false (DEFAULT)
performance.write-behind-pass-through false (DEFAULT)
performance.md-cache-timeout 600
performance.cache-swift-metadata false (DEFAULT)
performance.cache-samba-metadata on
performance.cache-capability-xattrs true (DEFAULT)
performance.cache-ima-xattrs true (DEFAULT)
performance.md-cache-statfs off (DEFAULT)
performance.xattr-cache-list (DEFAULT)
performance.nl-cache-pass-through false (DEFAULT)
network.frame-timeout 1800 (DEFAULT)
network.ping-timeout 20
network.tcp-window-size (null) (DEFAULT)
client.ssl off
network.remote-dio disable (DEFAULT)
client.event-threads 4
client.tcp-user-timeout 0
client.keepalive-time 20
client.keepalive-interval 2
client.keepalive-count 9
client.strict-locks off
network.tcp-window-size (null) (DEFAULT)
network.inode-lru-limit 200000
auth.allow *
auth.reject (null) (DEFAULT)
transport.keepalive 1
server.allow-insecure on (DEFAULT)
server.root-squash off (DEFAULT)
server.all-squash off (DEFAULT)
server.anonuid 65534 (DEFAULT)
server.anongid 65534 (DEFAULT)
server.statedump-path /var/run/gluster (DEFAULT)
server.outstanding-rpc-limit 64 (DEFAULT)
server.ssl off
auth.ssl-allow *
server.manage-gids off (DEFAULT)
server.dynamic-auth on (DEFAULT)
client.send-gids on (DEFAULT)
server.gid-timeout 300 (DEFAULT)
server.own-thread (null) (DEFAULT)
server.event-threads 4
server.tcp-user-timeout 42 (DEFAULT)
server.keepalive-time 20
server.keepalive-interval 2
server.keepalive-count 9
transport.listen-backlog 1024
ssl.own-cert (null) (DEFAULT)
ssl.private-key (null) (DEFAULT)
ssl.ca-list (null) (DEFAULT)
ssl.crl-path (null) (DEFAULT)
ssl.certificate-depth (null) (DEFAULT)
ssl.cipher-list (null) (DEFAULT)
ssl.dh-param (null) (DEFAULT)
ssl.ec-curve (null) (DEFAULT)
transport.address-family inet
performance.write-behind off
performance.read-ahead on
performance.readdir-ahead on
performance.io-cache off
performance.open-behind on
performance.quick-read on
performance.nl-cache on
performance.stat-prefetch on
performance.client-io-threads off
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true (DEFAULT)
performance.cache-invalidation on
performance.global-cache-invalidation true (DEFAULT)
features.uss off
features.snapshot-directory .snaps
features.show-snapshot-directory off
features.tag-namespaces off
network.compression off
network.compression.window-size -15 (DEFAULT)
network.compression.mem-level 8 (DEFAULT)
network.compression.min-size 0 (DEFAULT)
network.compression.compression-level -1 (DEFAULT)
network.compression.debug false (DEFAULT)
features.default-soft-limit 80% (DEFAULT)
features.soft-timeout 60 (DEFAULT)
features.hard-timeout 5 (DEFAULT)
features.alert-time 86400 (DEFAULT)
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota off
features.inode-quota off
features.bitrot disable
debug.trace off
debug.log-history no (DEFAULT)
debug.log-file no (DEFAULT)
debug.exclude-ops (null) (DEFAULT)
debug.include-ops (null) (DEFAULT)
debug.error-gen off
debug.error-failure (null) (DEFAULT)
debug.error-number (null) (DEFAULT)
debug.random-failure off (DEFAULT)
debug.error-fops (null) (DEFAULT)
nfs.disable on
features.read-only off (DEFAULT)
features.worm off
features.worm-file-level off
features.worm-files-deletable on
features.default-retention-period 120 (DEFAULT)
features.retention-mode relax (DEFAULT)
features.auto-commit-period 180 (DEFAULT)
storage.linux-aio off (DEFAULT)
storage.linux-io_uring off (DEFAULT)
storage.batch-fsync-mode reverse-fsync (DEFAULT)
storage.batch-fsync-delay-usec 0 (DEFAULT)
storage.owner-uid -1 (DEFAULT)
storage.owner-gid -1 (DEFAULT)
storage.node-uuid-pathinfo off (DEFAULT)
storage.health-check-interval 30 (DEFAULT)
storage.build-pgfid off (DEFAULT)
storage.gfid2path on (DEFAULT)
storage.gfid2path-separator : (DEFAULT)
storage.reserve 1 (DEFAULT)
storage.health-check-timeout 20 (DEFAULT)
storage.fips-mode-rchecksum on
storage.force-create-mode 0000 (DEFAULT)
storage.force-directory-mode 0000 (DEFAULT)
storage.create-mask 0777 (DEFAULT)
storage.create-directory-mask 0777 (DEFAULT)
storage.max-hardlinks 100 (DEFAULT)
features.ctime on (DEFAULT)
config.gfproxyd off
cluster.server-quorum-type server
cluster.server-quorum-ratio 51
changelog.changelog off (DEFAULT)
changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs (DEFAULT)
changelog.encoding ascii (DEFAULT)
changelog.rollover-time 15 (DEFAULT)
changelog.fsync-interval 5 (DEFAULT)
changelog.changelog-barrier-timeout 120
changelog.capture-del-path off (DEFAULT)
features.barrier disable
features.barrier-timeout 120
features.trash off (DEFAULT)
features.trash-dir .trashcan (DEFAULT)
features.trash-eliminate-path (null) (DEFAULT)
features.trash-max-filesize 5MB (DEFAULT)
features.trash-internal-op off (DEFAULT)
cluster.enable-shared-storage disable
locks.trace off (DEFAULT)
locks.mandatory-locking off (DEFAULT)
cluster.disperse-self-heal-daemon enable (DEFAULT)
cluster.quorum-reads no (DEFAULT)
client.bind-insecure (null) (DEFAULT)
features.timeout 45 (DEFAULT)
features.failover-hosts (null) (DEFAULT)
features.shard off
features.shard-block-size 64MB (DEFAULT)
features.shard-lru-limit 16384 (DEFAULT)
features.shard-deletion-rate 100 (DEFAULT)
features.scrub-throttle lazy
features.scrub-freq biweekly
features.scrub false (DEFAULT)
features.expiry-time 120
features.signer-threads 4
features.cache-invalidation on
features.cache-invalidation-timeout 600
ganesha.enable off
features.leases off
features.lease-lock-recall-timeout 60 (DEFAULT)
disperse.background-heals 8 (DEFAULT)
disperse.heal-wait-qlength 128 (DEFAULT)
cluster.heal-timeout 600 (DEFAULT)
dht.force-readdirp on (DEFAULT)
disperse.read-policy gfid-hash (DEFAULT)
cluster.shd-max-threads 4
cluster.shd-wait-qlength 1024 (DEFAULT)
cluster.locking-scheme full (DEFAULT)
cluster.granular-entry-heal no (DEFAULT)
features.locks-revocation-secs 0 (DEFAULT)
features.locks-revocation-clear-all false (DEFAULT)
features.locks-revocation-max-blocked 0 (DEFAULT)
features.locks-monkey-unlocking false (DEFAULT)
features.locks-notify-contention yes (DEFAULT)
features.locks-notify-contention-delay 5 (DEFAULT)
disperse.shd-max-threads 1 (DEFAULT)
disperse.shd-wait-qlength 4096
disperse.cpu-extensions auto (DEFAULT)
disperse.self-heal-window-size 32 (DEFAULT)
cluster.use-compound-fops off
performance.parallel-readdir on
performance.rda-request-size 131072
performance.rda-low-wmark 4096 (DEFAULT)
performance.rda-high-wmark 128KB (DEFAULT)
performance.rda-cache-limit 10MB
performance.nl-cache-positive-entry false (DEFAULT)
performance.nl-cache-limit 10MB
performance.nl-cache-timeout 600
cluster.brick-multiplex disable
cluster.brick-graceful-cleanup disable
glusterd.vol_count_per_thread 100
cluster.max-bricks-per-process 250
disperse.optimistic-change-log on (DEFAULT)
disperse.stripe-cache 4 (DEFAULT)
cluster.halo-enabled False (DEFAULT)
cluster.halo-shd-max-latency 99999 (DEFAULT)
cluster.halo-nfsd-max-latency 5 (DEFAULT)
cluster.halo-max-latency 5 (DEFAULT)
cluster.halo-max-replicas 99999 (DEFAULT)
cluster.halo-min-replicas 2 (DEFAULT)
features.selinux on
cluster.daemon-log-level INFO
debug.delay-gen off
delay-gen.delay-percentage 10% (DEFAULT)
delay-gen.delay-duration 100000 (DEFAULT)
delay-gen.enable (DEFAULT)
disperse.parallel-writes on (DEFAULT)
disperse.quorum-count 0 (DEFAULT)
features.sdfs off
features.cloudsync off
features.ctime on
ctime.noatime on
features.cloudsync-storetype (null) (DEFAULT)
features.enforce-mandatory-lock off
config.global-threading off
config.client-threads 16
config.brick-threads 16
features.cloudsync-remote-read off
features.cloudsync-store-id (null) (DEFAULT)
features.cloudsync-product-id (null) (DEFAULT)
features.acl enable
cluster.use-anonymous-inode yes
rebalance.ensure-durability on (DEFAULT)
Again, sorry for the long post. We would be happy to have this solved, as we are excited to be using GlusterFS and would like to get back to a stable configuration.
We always appreciate the spirit of collaboration and
reciprocal help on this list.
Best
Ilias
--
forumZFD
Entschieden für Frieden | Committed to Peace
Ilias Chasapakis
Referent IT | IT Consultant
Forum Ziviler Friedensdienst e.V. | Forum Civil Peace
Service
Am Kölner Brett 8 | 50825 Köln | Germany
Tel 0221 91273243 | Fax 0221 91273299 |
http://www.forumZFD.de
Vorstand nach § 26 BGB,
einzelvertretungsberechtigt|Executive Board:
Alexander Mauz, Sonja Wiekenberg-Mlalandle, Jens von
Bargen
VR 17651 Amtsgericht Köln
Spenden|Donations: IBAN DE90 4306 0967 4103 7264 00
BIC GENODEM1GLS
________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge:
https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users