Hi,
It seems to be crashing inside open-behind. There is a known bug in that xlator that causes a crash; it was fixed in 7.7, which was released recently. Can you try upgrading?
Xavi
On Fri, Jul 24, 2020 at 8:50 AM <nico@xxxxxxxxxx> wrote:
We're using Gluster in a production environment with 3 nodes (2 data + 1 arbiter).
One of our VM Gluster FUSE clients regularly crashes on a particular volume. We recently upgraded all nodes and the client to 7.6, but the client is still crashing.
All cluster nodes and the client run Debian stretch (9.12); Gluster was installed from our local Gluster apt repository mirror, and op-version is set to 70200.
The volume contains a lot of files and directories, but performance doesn't really matter. It seems to crash during this command:

find logscli -mtime +1 -type f | tar c -T - -f - --remove-files | tar xpf - -C /drbd

The volume was remounted this morning with the DEBUG log level; we are waiting for the next crash.

Volume attributes are:

Option Value
------ -----
cluster.lookup-unhashed on
cluster.lookup-optimize on
cluster.min-free-disk 10%
cluster.min-free-inodes 5%
cluster.rebalance-stats off
cluster.subvols-per-directory (null)
cluster.readdir-optimize off
cluster.rsync-hash-regex (null)
cluster.extra-hash-regex (null)
cluster.dht-xattr-name trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid off
cluster.rebal-throttle normal
cluster.lock-migration off
cluster.force-migration off
cluster.local-volume-name (null)
cluster.weighted-rebalance on
cluster.switch-pattern (null)
cluster.entry-change-log on
cluster.read-subvolume (null)
cluster.read-subvolume-index -1
cluster.read-hash-mode 1
cluster.background-self-heal-count 8
cluster.metadata-self-heal off
cluster.data-self-heal off
cluster.entry-self-heal off
cluster.self-heal-daemon enable
cluster.heal-timeout 60
cluster.self-heal-window-size 1
cluster.data-change-log on
cluster.metadata-change-log on
cluster.data-self-heal-algorithm full
cluster.eager-lock on
disperse.eager-lock on
disperse.other-eager-lock on
disperse.eager-lock-timeout 1
disperse.other-eager-lock-timeout 1
cluster.quorum-type fixed
cluster.quorum-count 1
cluster.choose-local true
cluster.self-heal-readdir-size 1KB
cluster.post-op-delay-secs 1
cluster.ensure-durability on
cluster.consistent-metadata no
cluster.heal-wait-queue-length 128
cluster.favorite-child-policy none
cluster.full-lock yes
cluster.optimistic-change-log on
diagnostics.latency-measurement off
diagnostics.dump-fd-stats off
diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO
diagnostics.client-log-level ERROR
diagnostics.brick-sys-log-level CRITICAL
diagnostics.client-sys-log-level CRITICAL
diagnostics.brick-logger (null)
diagnostics.client-logger (null)
diagnostics.brick-log-format (null)
diagnostics.client-log-format (null)
diagnostics.brick-log-buf-size 5
diagnostics.client-log-buf-size 5
diagnostics.brick-log-flush-timeout 120
diagnostics.client-log-flush-timeout 120
diagnostics.stats-dump-interval 0
diagnostics.fop-sample-interval 0
diagnostics.stats-dump-format json
diagnostics.fop-sample-buf-size 65535
diagnostics.stats-dnscache-ttl-sec 86400
performance.cache-max-file-size 0
performance.cache-min-file-size 0
performance.cache-refresh-timeout 1
performance.cache-priority
performance.cache-size 32MB
performance.io-thread-count 16
performance.high-prio-threads 16
performance.normal-prio-threads 16
performance.low-prio-threads 16
performance.least-prio-threads 1
performance.enable-least-priority on
performance.iot-watchdog-secs (null)
performance.iot-cleanup-disconnected-reqs off
performance.iot-pass-through false
performance.io-cache-pass-through false
performance.cache-size 128MB
performance.qr-cache-timeout 1
performance.cache-invalidation false
performance.ctime-invalidation false
performance.flush-behind on
performance.nfs.flush-behind on
performance.write-behind-window-size 1MB
performance.resync-failed-syncs-after-fsync off
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct off
performance.nfs.strict-o-direct off
performance.strict-write-ordering off
performance.nfs.strict-write-ordering off
performance.write-behind-trickling-writes on
performance.aggregate-size 128KB
performance.nfs.write-behind-trickling-writes on
performance.lazy-open yes
performance.read-after-open yes
performance.open-behind-pass-through false
performance.read-ahead-page-count 4
performance.read-ahead-pass-through false
performance.readdir-ahead-pass-through false
performance.md-cache-pass-through false
performance.md-cache-timeout 1
performance.cache-swift-metadata true
performance.cache-samba-metadata false
performance.cache-capability-xattrs true
performance.cache-ima-xattrs true
performance.md-cache-statfs off
performance.xattr-cache-list
performance.nl-cache-pass-through false
network.frame-timeout 1800
network.ping-timeout 5
network.tcp-window-size (null)
client.ssl on
network.remote-dio disable
client.event-threads 2
client.tcp-user-timeout 0
client.keepalive-time 20
client.keepalive-interval 2
client.keepalive-count 9
network.tcp-window-size (null)
network.inode-lru-limit 16384
auth.allow *
auth.reject (null)
transport.keepalive 1
server.allow-insecure on
server.root-squash off
server.all-squash off
server.anonuid 65534
server.anongid 65534
server.statedump-path /var/run/gluster
server.outstanding-rpc-limit 64
server.ssl on
auth.ssl-allow *
server.manage-gids off
server.dynamic-auth on
client.send-gids on
server.gid-timeout 300
server.own-thread (null)
server.event-threads 2
server.tcp-user-timeout 42
server.keepalive-time 20
server.keepalive-interval 2
server.keepalive-count 9
transport.listen-backlog 1024
ssl.cipher-list HIGH:!SSLv2
transport.address-family inet
performance.write-behind on
performance.read-ahead on
performance.readdir-ahead on
performance.io-cache on
performance.open-behind on
performance.quick-read on
performance.nl-cache off
performance.stat-prefetch on
performance.client-io-threads off
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true
performance.cache-invalidation false
performance.global-cache-invalidation true
features.uss off
features.snapshot-directory .snaps
features.show-snapshot-directory off
features.tag-namespaces off
network.compression off
network.compression.window-size -15
network.compression.mem-level 8
network.compression.min-size 0
network.compression.compression-level -1
network.compression.debug false
features.default-soft-limit 80%
features.soft-timeout 60
features.hard-timeout 5
features.alert-time 86400
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota off
features.inode-quota off
features.bitrot disable
debug.trace off
debug.log-history no
debug.log-file no
debug.exclude-ops (null)
debug.include-ops (null)
debug.error-gen off
debug.error-failure (null)
debug.error-number (null)
debug.random-failure off
debug.error-fops (null)
nfs.disable on
features.read-only off
features.worm off
features.worm-file-level off
features.worm-files-deletable on
features.default-retention-period 120
features.retention-mode relax
features.auto-commit-period 180
storage.linux-aio off
storage.batch-fsync-mode reverse-fsync
storage.batch-fsync-delay-usec 0
storage.owner-uid -1
storage.owner-gid -1
storage.node-uuid-pathinfo off
storage.health-check-interval 30
storage.build-pgfid off
storage.gfid2path on
storage.gfid2path-separator :
storage.reserve 1
storage.reserve-size 0
storage.health-check-timeout 10
storage.fips-mode-rchecksum off
storage.force-create-mode 0000
storage.force-directory-mode 0000
storage.create-mask 0777
storage.create-directory-mask 0777
storage.max-hardlinks 100
features.ctime off
config.gfproxyd off
cluster.server-quorum-type off
cluster.server-quorum-ratio 51
changelog.changelog off
changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs
changelog.encoding ascii
changelog.rollover-time 15
changelog.fsync-interval 5
changelog.changelog-barrier-timeout 120
changelog.capture-del-path off
features.barrier disable
features.barrier-timeout 120
features.trash off
features.trash-dir .trashcan
features.trash-eliminate-path (null)
features.trash-max-filesize 5MB
features.trash-internal-op off
cluster.enable-shared-storage disable
locks.trace off
locks.mandatory-locking off
cluster.disperse-self-heal-daemon enable
cluster.quorum-reads false
client.bind-insecure (null)
features.timeout 45
features.failover-hosts (null)
features.shard off
features.shard-block-size 64MB
features.shard-lru-limit 16384
features.shard-deletion-rate 100
features.scrub-throttle lazy
features.scrub-freq biweekly
features.scrub false
features.expiry-time 120
features.cache-invalidation off
features.cache-invalidation-timeout 60
features.leases off
features.lease-lock-recall-timeout 60
disperse.background-heals 8
disperse.heal-wait-qlength 128
cluster.heal-timeout 60
dht.force-readdirp on
disperse.read-policy gfid-hash
cluster.shd-max-threads 1
cluster.shd-wait-qlength 1024
cluster.locking-scheme full
cluster.granular-entry-heal no
features.locks-revocation-secs 0
features.locks-revocation-clear-all false
features.locks-revocation-max-blocked 0
features.locks-monkey-unlocking false
features.locks-notify-contention no
features.locks-notify-contention-delay 5
disperse.shd-max-threads 1
disperse.shd-wait-qlength 1024
disperse.cpu-extensions auto
disperse.self-heal-window-size 1
cluster.use-compound-fops off
performance.parallel-readdir off
performance.rda-request-size 131072
performance.rda-low-wmark 4096
performance.rda-high-wmark 128KB
performance.rda-cache-limit 10MB
performance.nl-cache-positive-entry false
performance.nl-cache-limit 10MB
performance.nl-cache-timeout 60
cluster.brick-multiplex disable
glusterd.vol_count_per_thread 100
cluster.max-bricks-per-process 250
disperse.optimistic-change-log on
disperse.stripe-cache 4
cluster.halo-enabled False
cluster.halo-shd-max-latency 99999
cluster.halo-nfsd-max-latency 5
cluster.halo-max-latency 5
cluster.halo-max-replicas 99999
cluster.halo-min-replicas 2
features.selinux on
cluster.daemon-log-level INFO
debug.delay-gen off
delay-gen.delay-percentage 10%
delay-gen.delay-duration 100000
delay-gen.enable
disperse.parallel-writes on
features.sdfs off
features.cloudsync off
features.ctime off
ctime.noatime on
features.cloudsync-storetype (null)
features.enforce-mandatory-lock off
config.global-threading off
config.client-threads 16
config.brick-threads 16
features.cloudsync-remote-read off
features.cloudsync-store-id (null)
features.cloudsync-product-id (null)

Crash log found in /var/log/glusterfs/partage-logscli.log

2020-07-23 02:34:36
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 7.6
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x25e50)[0x7fbe02138e50]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2f7)[0x7fbe021434b7]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060)[0x7fbe00b87060]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_mutex_lock+0x0)[0x7fbe01396b40]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/open-behind.so(+0x32f5)[0x7fbdfa89b2f5]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/open-behind.so(+0x3c52)[0x7fbdfa89bc52]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/open-behind.so(+0x3dac)[0x7fbdfa89bdac]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/open-behind.so(+0x3f73)[0x7fbdfa89bf73]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_unlink+0xbc)[0x7fbe021c401c]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/md-cache.so(+0x4495)[0x7fbdfa46c495]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/debug/io-stats.so(+0x5f44)[0x7fbdfa23af44]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_unlink+0xbc)[0x7fbe021c401c]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x1154f)[0x7fbdff7dc54f]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x7775)[0x7fbdff7d2775]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x74c8)[0x7fbdff7d24c8]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x77be)[0x7fbdff7d27be]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x6ac3)[0x7fbdff7d1ac3]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x7188)[0x7fbdff7d2188]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x74e8)[0x7fbdff7d24e8]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x779e)[0x7fbdff7d279e]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x77e0)[0x7fbdff7d27e0]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x83f9)[0x7fbdff7d33f9]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/mount/fuse.so(+0x21d3c)[0x7fbdff7ecd3c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7fbe013944a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fbe00c3cd0f]
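
For anyone reading the backtrace above: the frames run from the signal handler down to the thread start, so the first frame whose path sits under an xlator directory is probably the translator where the crash happened. Below is a rough Python sketch of that reading, not a Gluster tool. The helper names (parse_frames, first_xlator) and the regular expression are made up for illustration, and the sample holds only a few frames copied from the trace.

```python
import re
from pathlib import PurePosixPath

# Matches frames of the form: /path/to/lib.so(symbol+0xOFFSET)[0xADDRESS]
# The symbol part may be empty, as in "(+0x32f5)".
FRAME_RE = re.compile(
    r"^(?P<path>/[^\s(]+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)

# A few frames copied from the crash log, in their original order.
SAMPLE = """\
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x25e50)[0x7fbe02138e50]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2f7)[0x7fbe021434b7]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060)[0x7fbe00b87060]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_mutex_lock+0x0)[0x7fbe01396b40]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/open-behind.so(+0x32f5)[0x7fbdfa89b2f5]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_unlink+0xbc)[0x7fbe021c401c]
/usr/lib/x86_64-linux-gnu/glusterfs/7.6/xlator/performance/md-cache.so(+0x4495)[0x7fbdfa46c495]
"""


def parse_frames(text):
    """Return one dict per backtrace line that matches FRAME_RE."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def first_xlator(frames):
    """Return the name of the first frame located under an 'xlator' directory."""
    for frame in frames:
        path = PurePosixPath(frame["path"])
        if "xlator" in path.parts:
            return path.stem  # file name without the trailing ".so"
    return None


frames = parse_frames(SAMPLE)
print(first_xlator(frames))  # prints "open-behind"
```

Here the last named symbol before the open-behind frames is pthread_mutex_lock, and default_unlink shows up further down, so the crash appears to happen while an unlink request passes through open-behind. That fits the find | tar --remove-files command mentioned earlier.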
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users