Hi,
It sounds like there are a lot of files on the volume, which is why the rebalance will take time. What is the current rebalance status for the volume?
Rebalance should not affect volume operations, so is there a particular reason why the estimated time is a cause for concern?
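As a rough sanity check, the numbers quoted below are consistent with a long-running estimate: the status output shows about 1,952 files migrated in roughly 37 minutes, and df -i reports about 63.7 million files on the volume, so a full pass at that rate would be on the order of:

    # ~1,952 files in ~37 min  =>  ~3,200 files/hour
    # ~63.7M files on the volume (from the df -i output below)
    echo $(( 63694442 / 3200 ))    # => 19904 hours

Only the files that hash to the new brick actually need to move, so the reported 9919:44:34 is in the same ballpark.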
Regards,
Nithya
On 30 April 2018 at 13:10, shadowsocks飞飞 <kiwizhang618@xxxxxxxxx> wrote:
I cannot calculate the number of files normally. Through df -i I got the approximate number of files: 63694442

[root@CentOS-73-64-minimal ~]# df -i
Filesystem       Inodes      IUsed      IFree       IUse% Mounted on
/dev/md2         131981312   30901030   101080282   24%   /
devtmpfs         8192893     435        8192458     1%    /dev
tmpfs            8199799     8029       8191770     1%    /dev/shm
tmpfs            8199799     1415       8198384     1%    /run
tmpfs            8199799     16         8199783     1%    /sys/fs/cgroup
/dev/md3         110067712   29199861   80867851    27%   /home
/dev/md1         131072      363        130709      1%    /boot
gluster1:/web    2559860992  63694442   2496166550  3%    /web
tmpfs            8199799     1          8199798     1%    /run/user/0

The rebalance log is in the attachment.

The cluster information:

gluster volume status web detail
Status of volume: web
------------------------------------------------------------------------------
Brick                : Brick gluster1:/home/export/md3/brick
TCP Port             : 49154
RDMA Port            : 0
Online               : Y
Pid                  : 16730
File System          : ext4
Device               : /dev/md3
Mount Options        : rw,noatime,nodiratime,nobarrier,data=
Inode Size           : 256
Disk Space Free      : 239.4GB
Total Disk Space     : 1.6TB
Inode Count          : 110067712
Free Inodes          : 80867992
------------------------------------------------------------------------------
Brick                : Brick gluster1:/export/md2/brick
TCP Port             : 49155
RDMA Port            : 0
Online               : Y
Pid                  : 16758
File System          : ext4
Device               : /dev/md2
Mount Options        : rw,noatime,nodiratime,nobarrier,data=
Inode Size           : 256
Disk Space Free      : 589.4GB
Total Disk Space     : 1.9TB
Inode Count          : 131981312
Free Inodes          : 101080484
------------------------------------------------------------------------------
Brick                : Brick gluster2:/home/export/md3/brick
TCP Port             : 49152
RDMA Port            : 0
Online               : Y
Pid                  : 12556
File System          : xfs
Device               : /dev/md3
Mount Options        : rw,noatime,nodiratime,attr2,inode64,sunit=1024,swidth=3072,noquota
Inode Size           : 256
Disk Space Free      : 10.7TB
Total Disk Space     : 10.8TB
Inode Count          : 2317811968
Free Inodes          : 2314218207

Most of the files in the cluster are pictures smaller than 1M.

2018-04-30 15:16 GMT+08:00 Nithya Balachandran <nbalacha@xxxxxxxxxx>:

Hi,
This value is an ongoing rough estimate based on the amount of data rebalance has migrated since it started. The values will change as the rebalance progresses.
A few questions:
- How many files/dirs do you have on this volume?
- What is the average size of the files?
- What is the total size of the data on the volume?
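In case it helps, a quick way to approximate all three of these from a client (assuming the volume is mounted at /web):

    df -i /web    # IUsed is roughly the number of files and directories
    df -h /web    # Used space; Used / IUsed gives a rough average file size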
Can you send us the rebalance log?

Thanks,
Nithya

On 30 April 2018 at 10:33, kiwizhang618 <kiwizhang618@xxxxxxxxx> wrote:

I met a big problem: the cluster rebalance takes a long time after adding a new node.

gluster volume rebalance web status
Node       Rebalanced-files       size    scanned   failures   skipped   status        run time in h:m:s
---------  ----------------  ---------  ---------  ---------  --------  ------------  ------------------
localhost               900     43.5MB       2232          0        69  in progress              0:36:49
gluster2               1052     39.3MB       4393          0      1052  in progress              0:36:49
Estimated time left for rebalance to complete : 9919:44:34
volume rebalance: web: success

The rebalance log:

[glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.8 (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/web --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0 --xlator-option *dht.commit-hash=3610561770 --socket-file /var/run/gluster/gluster-rebalance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock --pid-file /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid -l /var/log/glusterfs/web-rebalance.log)
[2018-04-30 04:20:45.100902] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-30 04:20:45.103927] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-30 04:20:55.191261] E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]
[2018-04-30 04:21:19.783469] E [MSGID: 109023] [dht-rebalance.c:2669:gf_defrag_migrate_single_file] 0-web-dht: Migrate file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg, lookup failed [Stale file handle]
The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht: getxattr err for dir [No data available]" repeated 2 times between [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]

The gluster info:

Volume Name: web
Type: Distribute
Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/home/export/md3/brick
Brick2: gluster1:/export/md2/brick
Brick3: gluster2:/home/export/md3/brick
Options Reconfigured:
nfs.trusted-sync: on
nfs.trusted-write: on
cluster.rebal-throttle: aggressive
features.inode-quota: off
features.quota: off
cluster.shd-wait-qlength: 1024
transport.address-family: inet
cluster.lookup-unhashed: auto
performance.cache-size: 1GB
performance.client-io-threads: on
performance.write-behind-window-size: 4MB
performance.io-thread-count: 8
performance.force-readdirp: on
performance.readdir-ahead: on
cluster.readdir-optimize: on
performance.high-prio-threads: 8
performance.flush-behind: on
performance.write-behind: on
performance.quick-read: off
performance.io-cache: on
performance.read-ahead: off
server.event-threads: 8
cluster.lookup-optimize: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: off
performance.md-cache-timeout: 60
network.inode-lru-limit: 90000
diagnostics.brick-log-level: ERROR
diagnostics.brick-sys-log-level: ERROR
diagnostics.client-log-level: ERROR
diagnostics.client-sys-log-level: ERROR
cluster.min-free-disk: 20%
cluster.self-heal-window-size: 16
cluster.self-heal-readdir-size: 1024
cluster.background-self-heal-count: 4
cluster.heal-wait-queue-length: 128
client.event-threads: 8
performance.cache-invalidation: on
nfs.disable: off
nfs.acl: off
cluster.brick-multiplex: disable
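Two of the options above bear on the rebalance itself; as a sketch of the standard gluster volume set syntax (not specific advice for this volume):

    # cluster.rebal-throttle is already at the fastest setting (lazy|normal|aggressive)
    gluster volume set web cluster.rebal-throttle aggressive
    # the ERROR client log level hides rebalance progress messages; raising it
    # temporarily makes web-rebalance.log record what is being migrated
    gluster volume set web diagnostics.client-log-level INFO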
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users