On 8 March 2017 at 23:34, Jarsulic, Michael [CRI] <mjarsulic@xxxxxxxxxxxxxxxx> wrote:
I am having issues with one of my systems that houses two bricks and want to bring it down for maintenance. I was able to remove the first brick successfully and committed the changes. The second brick is giving me a lot of problems with the rebalance when I try to remove it. It seems like it is stuck somewhere in that process:
# gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status
Node        Rebalanced-files   size      scanned   failures   skipped   status        run time in secs
---------   ----------------   -------   -------   --------   -------   -----------   ----------------
localhost   0                  0Bytes    522       0          0         in progress   915.00
The rebalance logs show the following error message.
[2017-03-08 17:48:19.329934] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-hpcscratch-dht: fixing the layout of /userx/Ethiopian_imputation
[2017-03-08 17:48:19.329960] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 0 (hpcscratch-client-0): 45778954 chunks
[2017-03-08 17:48:19.329968] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 1 (hpcscratch-client-1): 45778954 chunks
[2017-03-08 17:48:19.329974] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 2 (hpcscratch-client-4): 45778954 chunks
[2017-03-08 17:48:19.329979] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 3 (hpcscratch-client-5): 45778954 chunks
[2017-03-08 17:48:19.329983] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 4 (hpcscratch-client-7): 45778954 chunks
[2017-03-08 17:48:19.400394] I [MSGID: 109036] [dht-common.c:7869:dht_log_new_layout_for_dir_selfheal] 0-hpcscratch-dht: Setting layout of /userx/Ethiopian_imputation with [Subvol_name: hpcscratch-client-0, Err: -1 , Start: 1052915942 , Stop: 2105831883 , Hash: 1 ], [Subvol_name: hpcscratch-client-1, Err: -1 , Start: 3158747826 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: hpcscratch-client-4, Err: -1 , Start: 0 , Stop: 1052915941 , Hash: 1 ], [Subvol_name: hpcscratch-client-5, Err: -1 , Start: 2105831884 , Stop: 3158747825 , Hash: 1 ], [Subvol_name: hpcscratch-client-7, Err: 22 , Start: 0 , Stop: 0 , Hash: 0 ],
[2017-03-08 17:48:19.480882] I [dht-rebalance.c:2446:gf_defrag_process_dir] 0-hpcscratch-dht: migrate data called on /userx/Ethiopian_imputation
These are not error messages - these are info messages logged when the layout for a directory is being set and can be ignored.
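To look only for real problems in the log, one option is to filter on the severity letter that follows the timestamp (the lines quoted above all carry "I"). This is a minimal sketch: the function name is made up for illustration, and it assumes the other severities appear as W, E and C in the same position.

```shell
#!/bin/sh
# Hypothetical helper: print only warning (W), error (E) and critical (C)
# entries from a Gluster log, skipping the informational "I" lines.
# Assumes each entry starts with "[timestamp] <letter> [", as in the excerpt above.
gluster_log_problems() {
    grep -E '^\[[^]]+\] [WEC] \[' "$@"
}
```

It can be run as `gluster_log_problems <path to the rebalance log>`; the log location depends on the installation, so the path is left to the reader. With no file argument it reads standard input, and it prints nothing when the log holds only info messages.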
The remove-brick operation is still in progress according to the status. What is it that makes you feel it is stuck? Is there no difference in the status output even after a considerable interval?
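One rough way to check this is to sample the status twice and compare the "scanned" column. The sketch below reuses the volume and brick names from the command above; the function names, the 300-second interval, and the assumption that "scanned" is always the fourth whitespace-separated field are illustrative choices, not anything Gluster prescribes.

```shell
#!/bin/sh
# Hypothetical helper: read remove-brick status output on stdin and print
# the "scanned" counter (assumed to be field 4) for the given node.
scanned_count() {
    awk -v node="$1" '$1 == node { print $4 }'
}

# Values taken from the status command above; INTERVAL is an arbitrary choice.
VOLUME=hpcscratch
BRICK=cri16fs002-ib:/data/brick4/scratch
INTERVAL=300

# Sample the status twice, INTERVAL seconds apart, and report whether
# the scanned counter moved.
check_progress() {
    before=$(gluster volume remove-brick "$VOLUME" "$BRICK" status | scanned_count localhost)
    sleep "$INTERVAL"
    after=$(gluster volume remove-brick "$VOLUME" "$BRICK" status | scanned_count localhost)
    if [ "${after:-0}" -gt "${before:-0}" ]; then
        echo "progressing: scanned $before -> $after"
    else
        echo "no change in scanned count (${before:-unknown}) after ${INTERVAL}s"
    fi
}
```

Calling `check_progress` on the node that hosts the brick prints one line. A scanned count that keeps rising while Rebalanced-files stays at 0 would suggest the crawl is still walking directories rather than being stuck.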
Regards,
Nithya
Any suggestions on how I can get this brick out of play and preserve the data?
--
Mike Jarsulic
Sr. HPC Administrator
Center for Research Informatics | University of Chicago
773.702.2066
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users