distribute remove-brick has started migrating the wrong brick (glusterfs 3.8.13)


I requested removal of a brick from a distribute-only volume, but Gluster appears to be migrating data off the wrong brick. I doubt I am misreading this, because disk usage is clearly decreasing on the wrong brick.
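(For reference, I'm watching per-brick usage with something like the loop below on each server; the brick mount points come from `gluster volume status`.)

```shell
# Check per-brick disk usage on one server; only the brick being
# drained by remove-brick should be shrinking over time.
for b in /export/md0 /export/md1; do
    df -h "$b" | tail -1
done
```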

gluster> volume status
Status of volume: video-backup
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.41:/export/md0/brick           49172     0          Y       5306 
Brick 10.0.0.42:/export/md0/brick           49172     0          Y       3651 
Brick 10.0.0.43:/export/md0/brick           49155     0          Y       2826 
Brick 10.0.0.41:/export/md1/brick           49173     0          Y       5311 
Brick 10.0.0.42:/export/md1/brick           49173     0          Y       3656 
Brick 10.0.0.41:/export/md2/brick           49174     0          Y       5316 
Brick 10.0.0.42:/export/md2/brick           49174     0          Y       3662 
Brick 10.0.0.41:/export/md3/brick           49175     0          Y       5322 
Brick 10.0.0.42:/export/md3/brick           49175     0          Y       3667 
Brick 10.0.0.43:/export/md1/brick           49156     0          Y       4836 
 
Task Status of Volume video-backup
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 7895be7c-4ab9-440d-a301-c11dae0dd9e1
Status               : completed           
 
gluster> volume remove-brick video-backup 10.0.0.43:/export/md1/brick start
volume remove-brick start: success
ID: f666a196-03c2-4940-bd38-45d8383345a4

gluster> volume status 
Status of volume: video-backup
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.41:/export/md0/brick           49172     0          Y       5306 
Brick 10.0.0.42:/export/md0/brick           49172     0          Y       3651 
Brick 10.0.0.43:/export/md0/brick           49155     0          Y       2826 
Brick 10.0.0.41:/export/md1/brick           49173     0          Y       5311 
Brick 10.0.0.42:/export/md1/brick           49173     0          Y       3656 
Brick 10.0.0.41:/export/md2/brick           49174     0          Y       5316 
Brick 10.0.0.42:/export/md2/brick           49174     0          Y       3662 
Brick 10.0.0.41:/export/md3/brick           49175     0          Y       5322 
Brick 10.0.0.42:/export/md3/brick           49175     0          Y       3667 
Brick 10.0.0.43:/export/md1/brick           49156     0          Y       4836 
 
Task Status of Volume video-backup
------------------------------------------------------------------------------
Task                 : Remove brick        
ID                   : f666a196-03c2-4940-bd38-45d8383345a4
Removed bricks:     
10.0.0.43:/export/md1/brick
Status               : in progress         


But when I check the rebalance log on the host with the brick being removed, it is actually migrating data from the other brick on the same host, 10.0.0.43:/export/md0/brick:


.....
[2018-12-11 11:59:52.572657] I [MSGID: 109086] [dht-shared.c:297:dht_parse_decommissioned_bricks] 0-video-backup-dht: decommissioning subvolume video-backup-client-9
....
 29: volume video-backup-client-2
 30:     type protocol/client
 31:     option clnt-lk-version 1
 32:     option volfile-checksum 0
 33:     option volfile-key rebalance/video-backup
 34:     option client-version 3.8.15
 35:     option process-uuid node-dc4-03-25536-2018/12/11-11:59:47:551328-video-backup-client-2-0-0
 36:     option fops-version 1298437
 37:     option ping-timeout 42
 38:     option remote-host 10.0.0.43
 39:     option remote-subvolume /export/md0/brick
 40:     option transport-type socket
 41:     option transport.address-family inet
 42:     option username 9e7fe743-ecd7-40aa-b3db-e112086b2fc7
 43:     option password dab178d6-ecb4-4293-8c1d-6281ec2cafc2
 44: end-volume
...
112: volume video-backup-client-9
113:     type protocol/client
114:     option ping-timeout 42
115:     option remote-host 10.0.0.43
116:     option remote-subvolume /export/md1/brick
117:     option transport-type socket
118:     option transport.address-family inet
119:     option username 9e7fe743-ecd7-40aa-b3db-e112086b2fc7
120:     option password dab178d6-ecb4-4293-8c1d-6281ec2cafc2
121: end-volume
...
[2018-12-11 11:59:52.608698] I [dht-rebalance.c:3668:gf_defrag_start_crawl] 0-video-backup-dht: gf_defrag_start_crawl using commit hash 3766302106
[2018-12-11 11:59:52.609478] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr] 0-video-backup-dht: fixing the layout of /
[2018-12-11 11:59:52.615348] I [MSGID: 0] [dht-rebalance.c:3746:gf_defrag_start_crawl] 0-video-backup-dht: local subvols are video-backup-client-2
[2018-12-11 11:59:52.615378] I [MSGID: 0] [dht-rebalance.c:3746:gf_defrag_start_crawl] 0-video-backup-dht: local subvols are video-backup-client-9
...
[2018-12-11 11:59:52.616554] I [dht-rebalance.c:2652:gf_defrag_process_dir] 0-video-backup-dht: migrate data called on /
[2018-12-11 11:59:54.000363] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /symlinks.txt: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:55.110549] I [MSGID: 109022] [dht-rebalance.c:1703:dht_migrate_file] 0-video-backup-dht: completed migration of /symlinks.txt from subvolume video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:58.100931] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr] 0-video-backup-dht: fixing the layout of /A6
[2018-12-11 11:59:58.107389] I [dht-rebalance.c:2652:gf_defrag_process_dir] 0-video-backup-dht: migrate data called on /A6
[2018-12-11 11:59:58.132138] I [dht-rebalance.c:2866:gf_defrag_process_dir] 0-video-backup-dht: Migration operation on dir /A6 took 0.02 secs
[2018-12-11 11:59:58.330393] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr] 0-video-backup-dht: fixing the layout of /A6/2017
[2018-12-11 11:59:58.337601] I [dht-rebalance.c:2652:gf_defrag_process_dir] 0-video-backup-dht: migrate data called on /A6/2017
[2018-12-11 11:59:58.493906] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/57c81ed09f31cd6c1c8990ae20160908101048: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:58.706068] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/57c81ed09f31cd6c1c8990ae20160908120734132317: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:58.783952] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161124091841: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:58.843315] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161124135453: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:58.951637] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161122111252: attempting to move from video-backup-client-2 to video-backup-client-4
[2018-12-11 11:59:59.005324] I [dht-rebalance.c:2866:gf_defrag_process_dir] 0-video-backup-dht: Migration operation on dir /A6/2017 took 0.67 secs
[2018-12-11 11:59:59.005362] I [dht-rebalance.c:1230:dht_migrate_file] 0-video-backup-dht: /A6/2017/58906aaaaca0515f5994104d20170213154555: attempting to move from video-backup-client-2 to video-backup-client-4

etc...
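For mapping the video-backup-client-N names in the log back to bricks, the subvolume-to-brick mapping can be pulled out of the volfile directly. A quick sketch, assuming the usual glusterd volfile location (adjust the path if your layout differs):

```shell
# Print "subvolume-name host:brick-path" for every protocol/client
# subvolume defined in the volume's fuse volfile.
VOLFILE=/var/lib/glusterd/vols/video-backup/video-backup.tcp-fuse.vol
awk '/^volume .*-client-/   {name=$2}
     /option remote-host/      {host=$3}
     /option remote-subvolume/ {print name, host":"$3}' "$VOLFILE"
```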

Can I stop/cancel it without data loss? And how can I make Gluster remove the correct brick?
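From the CLI help, I believe an in-progress remove-brick can be aborted with the command below (the brick stays in the volume and files already migrated remain where they landed), but I'd appreciate confirmation before running it:

```shell
# Abort the in-progress remove-brick data migration for this brick;
# no data is deleted and the brick is not removed from the volume.
gluster volume remove-brick video-backup 10.0.0.43:/export/md1/brick stop
```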

Thanks
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
