I met a big problem,the cluster rebalance takes a long time after adding a new node
gluster volume rebalance web status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 900 43.5MB 2232 0 69 in progress 0:36:49
gluster2 1052 39.3MB 4393 0 1052 in progress 0:36:49
Estimated time left for rebalance to complete : 9919:44:34
volume rebalance: web: success
the rebalance log
[glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.8 (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/web --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0 --xlator-option *dht.commit-hash=3610561770 --socket-file /var/run/gluster/gluster-rebalance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock --pid-file /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid -l /var/log/glusterfs/web-rebalance.log)
[2018-04-30 04:20:45.100902] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-30 04:20:45.103927] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-30 04:20:55.191261] E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]
[2018-04-30 04:21:19.783469] E [MSGID: 109023] [dht-rebalance.c:2669:gf_defrag_migrate_single_file] 0-web-dht: Migrate file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg, lookup failed [Stale file handle]
The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]" repeated 2 times between [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]
the gluster info
Volume Name: web
Type: Distribute
Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/home/export/md3/brick
Brick2: gluster1:/export/md2/brick
Brick3: gluster2:/home/export/md3/brick
Options Reconfigured:
nfs.trusted-sync: on
nfs.trusted-write: on
cluster.rebal-throttle: aggressive
features.inode-quota: off
features.quota: off
cluster.shd-wait-qlength: 1024
transport.address-family: inet
cluster.lookup-unhashed: auto
performance.cache-size: 1GB
performance.client-io-threads: on
performance.write-behind-window-size: 4MB
performance.io-thread-count: 8
performance.force-readdirp: on
performance.readdir-ahead: on
cluster.readdir-optimize: on
performance.high-prio-threads: 8
performance.flush-behind: on
performance.write-behind: on
performance.quick-read: off
performance.io-cache: on
performance.read-ahead: off
server.event-threads: 8
cluster.lookup-optimize: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: off
performance.md-cache-timeout: 60
network.inode-lru-limit: 90000
diagnostics.brick-log-level: ERROR
diagnostics.brick-sys-log-level: ERROR
diagnostics.client-log-level: ERROR
diagnostics.client-sys-log-level: ERROR
cluster.min-free-disk: 20%
cluster.self-heal-window-size: 16
cluster.self-heal-readdir-size: 1024
cluster.background-self-heal-count: 4
cluster.heal-wait-queue-length: 128
client.event-threads: 8
performance.cache-invalidation: on
nfs.disable: off
nfs.acl: off
cluster.brick-multiplex: disable
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://lists.gluster.org/mailman/listinfo/gluster-users