Re: Gluster rebalance taking many years

Hi,


This value is a rough, ongoing estimate based on the amount of data the rebalance has migrated since it started, so it will change as the rebalance progresses.
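
As a rough illustration of the arithmetic (not the exact formula the rebalance process uses): the status output below shows about 83 MB migrated across both nodes in roughly 37 minutes, on the order of 135 MB per hour. An estimate of ~9900 hours at that rate corresponds to roughly 1.3 TB of data still expected to move, so the answers to the questions below will help us judge whether that is plausible.
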
A few questions:
  1. How many files/dirs do you have on this volume? 
  2. What is the average size of the files?
  3. What is the total size of the data on the volume?
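
If it helps, something like the following, run directly on each brick directory (paths taken from your volume info below, with the internal .glusterfs directory excluded so it does not inflate the numbers), should give rough answers; adjust the path for each brick:

  find /home/export/md3/brick -path '*/.glusterfs' -prune -o -type f -print | wc -l
  find /home/export/md3/brick -path '*/.glusterfs' -prune -o -type d -print | wc -l
  du -sh --exclude=.glusterfs /home/export/md3/brick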

Can you send us the rebalance log?
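The full log should be at /var/log/glusterfs/web-rebalance.log on each node (that path is visible in the -l option of the rebalance command line you quoted below); please attach it from both nodes if possible.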


Thanks,
Nithya

On 30 April 2018 at 10:33, kiwizhang618 <kiwizhang618@xxxxxxxxx> wrote:
 I have run into a big problem: the cluster rebalance is taking a very long time after adding a new node.

gluster volume rebalance web status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              900        43.5MB          2232             0            69          in progress        0:36:49
                                gluster2             1052        39.3MB          4393             0          1052          in progress        0:36:49
Estimated time left for rebalance to complete :     9919:44:34
volume rebalance: web: success

The rebalance log:
[glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.8 (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/web --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0 --xlator-option *dht.commit-hash=3610561770 --socket-file /var/run/gluster/gluster-rebalance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock --pid-file /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid -l /var/log/glusterfs/web-rebalance.log)
[2018-04-30 04:20:45.100902] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-30 04:20:45.103927] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-30 04:20:55.191261] E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]
[2018-04-30 04:21:19.783469] E [MSGID: 109023] [dht-rebalance.c:2669:gf_defrag_migrate_single_file] 0-web-dht: Migrate file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg, lookup failed [Stale file handle]
The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]" repeated 2 times between [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]

The gluster volume info:
Volume Name: web
Type: Distribute
Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/home/export/md3/brick
Brick2: gluster1:/export/md2/brick
Brick3: gluster2:/home/export/md3/brick
Options Reconfigured:
nfs.trusted-sync: on
nfs.trusted-write: on
cluster.rebal-throttle: aggressive
features.inode-quota: off
features.quota: off
cluster.shd-wait-qlength: 1024
transport.address-family: inet
cluster.lookup-unhashed: auto
performance.cache-size: 1GB
performance.client-io-threads: on
performance.write-behind-window-size: 4MB
performance.io-thread-count: 8
performance.force-readdirp: on
performance.readdir-ahead: on
cluster.readdir-optimize: on
performance.high-prio-threads: 8
performance.flush-behind: on
performance.write-behind: on
performance.quick-read: off
performance.io-cache: on
performance.read-ahead: off
server.event-threads: 8
cluster.lookup-optimize: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: off
performance.md-cache-timeout: 60
network.inode-lru-limit: 90000
diagnostics.brick-log-level: ERROR
diagnostics.brick-sys-log-level: ERROR
diagnostics.client-log-level: ERROR
diagnostics.client-sys-log-level: ERROR
cluster.min-free-disk: 20%
cluster.self-heal-window-size: 16
cluster.self-heal-readdir-size: 1024
cluster.background-self-heal-count: 4
cluster.heal-wait-queue-length: 128
client.event-threads: 8
performance.cache-invalidation: on
nfs.disable: off
nfs.acl: off
cluster.brick-multiplex: disable


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
