Hi Jose,

Why are all the bricks visible in volume info if the pre-validation for
add-brick failed? I suspect that the remove-brick wasn't done properly. You
can provide the cmd_history.log to verify this; it would be better to have
the other log messages as well. I also need to know which bricks were
actually removed, the command that was used, and its output.
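
For reference, a quick way to pull the relevant entries out of
cmd_history.log (a sketch, assuming the default log directory
/var/log/glusterfs; adjust the path if your installation logs elsewhere):

    # Show every add-brick/remove-brick that glusterd recorded, with its outcome
    grep -E 'remove-brick|add-brick' /var/log/glusterfs/cmd_history.log
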
On Thu, Apr 26, 2018 at 3:47 AM, Jose Sanchez <josesanc@xxxxxxxxxxxx> wrote:
> Looking at the logs, it seems that it is trying to add the brick using the
> same port that was assigned to gluster01ib:
>
> Any ideas?
>
> Jose
>
> [2018-04-25 22:08:55.169302] I [MSGID: 106482]
> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
> Received add brick req
> [2018-04-25 22:08:55.186037] I [run.c:191:runner_log]
> (-->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0x33045)
> [0x7f5464b9b045]
> -->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0xcbd85)
> [0x7f5464c33d85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
> [0x7f54704cf1e5] ) 0-management: Ran script:
> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
> --volname=scratch --version=1 --volume-op=add-brick
> --gd-workdir=/var/lib/glusterd
> [2018-04-25 22:08:55.309534] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch on port 49152
> [2018-04-25 22:08:55.309659] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch.rdma on port 49153
> [2018-04-25 22:08:55.310231] E [MSGID: 106005]
> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
> brick gluster02ib:/gdata/brick1/scratch
> [2018-04-25 22:08:55.310275] E [MSGID: 106074]
> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
> bricks
> [2018-04-25 22:08:55.310304] E [MSGID: 106123]
> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
> failed.
> [2018-04-25 22:08:55.310316] E [MSGID: 106123]
> [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed
> for operation Add brick on local node
> [2018-04-25 22:08:55.310330] E [MSGID: 106123]
> [glusterd-mgmt.c:2018:glusterd_mgmt_v3_initiate_all_phases] 0-management:
> Commit Op Failed
> [2018-04-25 22:09:11.678141] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 22:09:11.678184] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 22:09:11.678200] E [MSGID: 106122]
> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
> Pre Validation failed on operation Add brick
>
> [root@gluster02 glusterfs]# gluster volume status scratch
> Status of volume: scratch
> Gluster process                            TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster01ib:/gdata/brick1/scratch    49152     49153      Y       1819
> Brick gluster01ib:/gdata/brick2/scratch    49154     49155      Y       1827
> Brick gluster02ib:/gdata/brick1/scratch    N/A       N/A        N       N/A
>
> Task Status of Volume scratch
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root@gluster02 glusterfs]#
>
> On Apr 25, 2018, at 3:23 PM, Jose Sanchez <josesanc@xxxxxxxxxxxx> wrote:
>
> Hello Karthik,
>
> I'm having trouble bringing the two bricks back online. Any help is
> appreciated.
>
> Thanks
>
> When I try the add-brick command, this is what I get:
>
> [root@gluster01 ~]# gluster volume add-brick scratch
> gluster02ib:/gdata/brick2/scratch/
> volume add-brick: failed: Pre Validation failed on gluster02ib. Brick:
> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
> be contained by an existing brick
>
> I have run the following commands and removed the .glusterfs hidden
> directories:
>
> [root@gluster02 ~]# setfattr -x trusted.glusterfs.volume-id
> /gdata/brick2/scratch/
> setfattr: /gdata/brick2/scratch/: No such attribute
> [root@gluster02 ~]# setfattr -x trusted.gfid /gdata/brick2/scratch/
> setfattr: /gdata/brick2/scratch/: No such attribute
> [root@gluster02 ~]#
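
A note on the cleanup above: the setfattr calls are the standard way to
scrub a brick directory before reusing it, and the usual sequence also
deletes the .glusterfs directory. A sketch, for a brick that has genuinely
been removed from every volume (note that here the brick still shows up in
"gluster volume info", so pre-validation will keep rejecting it no matter
how thoroughly the directory is scrubbed):

    # Run on the node that owns the brick; this destroys gluster metadata
    setfattr -x trusted.glusterfs.volume-id /gdata/brick2/scratch/
    setfattr -x trusted.gfid /gdata/brick2/scratch/
    rm -rf /gdata/brick2/scratch/.glusterfs
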
> This is what I get when I run status and info:
>
> [root@gluster01 ~]# gluster volume info scratch
>
> Volume Name: scratch
> Type: Distribute
> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4
> Transport-type: tcp,rdma
> Bricks:
> Brick1: gluster01ib:/gdata/brick1/scratch
> Brick2: gluster01ib:/gdata/brick2/scratch
> Brick3: gluster02ib:/gdata/brick1/scratch
> Brick4: gluster02ib:/gdata/brick2/scratch
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> [root@gluster01 ~]#
>
> [root@gluster02 ~]# gluster volume status scratch
> Status of volume: scratch
> Gluster process                            TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster01ib:/gdata/brick1/scratch    49156     49157      Y       1819
> Brick gluster01ib:/gdata/brick2/scratch    49158     49159      Y       1827
> Brick gluster02ib:/gdata/brick1/scratch    N/A       N/A        N       N/A
> Brick gluster02ib:/gdata/brick2/scratch    N/A       N/A        N       N/A
>
> Task Status of Volume scratch
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root@gluster02 ~]#
>
> These are the log entries from the glusterd log
> (etc-glusterfs-glusterd.vol.log):
>
> [2018-04-25 20:56:54.390662] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch on port 49152
> [2018-04-25 20:56:54.390798] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch.rdma on port 49153
> [2018-04-25 20:56:54.391401] E [MSGID: 106005]
> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
> brick gluster02ib:/gdata/brick1/scratch
> [2018-04-25 20:56:54.391457] E [MSGID: 106074]
> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
> bricks
> [2018-04-25 20:56:54.391476] E [MSGID: 106123]
> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
> failed.
> [2018-04-25 20:56:54.391490] E [MSGID: 106123]
> [glusterd-mgmt-handler.c:603:glusterd_handle_commit_fn] 0-management: commit
> failed on operation Add brick
> [2018-04-25 20:58:55.332262] I [MSGID: 106499]
> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
> Received status volume req for volume scratch
> [2018-04-25 21:02:07.464357] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 21:02:07.464395] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 21:02:07.464414] E [MSGID: 106122]
> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
> Pre Validation failed on operation Add brick
> [2018-04-25 21:04:56.198662] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 21:04:56.198700] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 21:04:56.198716] E [MSGID: 106122]
> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
> Pre Validation failed on operation Add brick
> [2018-04-25 21:07:11.084205] I [MSGID: 106482]
> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
> Received add brick req
> [2018-04-25 21:07:11.087682] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 21:07:11.087716] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 21:07:11.087729] E [MSGID: 106122]
> [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre
> Validation failed for operation Add brick on local node
> [2018-04-25 21:07:11.087741] E [MSGID: 106122]
> [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management:
> Pre Validation Failed
> [2018-04-25 21:12:22.340221] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 21:12:22.340259] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 21:12:22.340274] E [MSGID: 106122]
> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
> Pre Validation failed on operation Add brick
> [2018-04-25 21:18:13.427036] I [MSGID: 106499]
> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
> Received status volume req for volume scratch
> [root@gluster02 glusterfs]#
>
> ---------------------------------
> Jose Sanchez
> Systems/Network Analyst 1
> Center of Advanced Research Computing
> 1601 Central Ave.
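
When pre-validation repeatedly refuses a brick path like this, dumping the
extended attributes on the brick root shows what gluster metadata is still
present (a sketch; getfattr ships with the attr package and needs root to
read the trusted.* namespace):

    # Print all extended attributes on the brick root, values in hex
    getfattr -d -m . -e hex /gdata/brick2/scratch/
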
> MSC 01 1190
> Albuquerque, NM 87131-0001
> carc.unm.edu
> 575.636.4232
>
> On Apr 12, 2018, at 12:11 AM, Karthik Subrahmanya <ksubrahm@xxxxxxxxxx>
> wrote:
>
> On Wed, Apr 11, 2018 at 7:38 PM, Jose Sanchez <josesanc@xxxxxxxxxxxx> wrote:
>>
>> Hi Karthik,
>>
>> Looking at the information you have provided me, I would like to make sure
>> that I'm running the right commands.
>>
>> 1. gluster volume heal scratch info
>
> If the count is non-zero, trigger the heal and wait for the heal info
> count to become zero.
>>
>> 2. gluster volume remove-brick scratch replica 1
>> gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
>>
>> 3. gluster volume add-brick "#" scratch gluster02ib:/gdata/brick1/scratch
>> gluster02ib:/gdata/brick2/scratch
>>
>> Based on the configuration I have, brick 1 on nodes A and B are tied
>> together, and brick 2 on nodes A and B are also tied together. Looking at
>> your remove command (step #2), it seems that you want me to remove bricks
>> 1 and 2 from node B (gluster02ib). Is that correct? I thought the data
>> was distributed in bricks 1 (between nodes A and B) and duplicated on
>> bricks 2 (nodes A and B).
>
> Data is duplicated between bricks 1 of nodes A & B and bricks 2 of nodes
> A & B, and data is distributed between these two pairs.
> You need not always remove bricks 1 & 2 from node B itself. The idea here
> is to keep one copy from each of the replica pairs.
>>
>> Also, when I add the bricks back to gluster, do I need to specify whether
>> the volume is distributed or replicated, and do I need a configuration
>> "#"? For example, in your command (step #2) you have "replica 1" when
>> removing bricks; do I need to do the same when adding the nodes back?
>
> No. You just need to erase the data on those bricks and add the bricks
> back to the volume. The previous remove-brick command will make the volume
> plain distribute. Then simply adding the bricks without specifying any "#"
> will expand the volume as a plain distribute volume.
>>
>> I'm planning on moving ahead with these changes in a few days. At this
>> point each brick has 14 TB, so with bricks 1 from nodes A and B I have a
>> total of 28 TB. After doing the whole process (removing and adding
>> bricks) I should be able to see a total of 56 TB, right?
>
> Yes, after all of this you will have 56 TB in total.
> After adding the bricks, do a volume rebalance, so that the data which was
> present previously will be moved to the correct bricks.
>
> HTH,
> Karthik
>>
>> Thanks
>>
>> Jose
>>
>> ---------------------------------
>> Jose Sanchez
>> Systems/Network Analyst 1
>> Center of Advanced Research Computing
>> 1601 Central Ave.
>> MSC 01 1190
>> Albuquerque, NM 87131-0001
>> carc.unm.edu
>> 575.636.4232
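
Putting Karthik's answers together, the end-to-end procedure would look
roughly like the sketch below (brick paths and volume name taken from this
thread; "force" is required because dropping the replica count discards one
copy of the data):

    # 1. Make sure there is nothing left to heal before removing a copy
    gluster volume heal scratch info

    # 2. Drop to replica 1, keeping one copy from each replica pair
    gluster volume remove-brick scratch replica 1 \
        gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force

    # 3. Erase the data and .glusterfs on the removed bricks (on gluster02ib),
    #    then add them back as plain distribute bricks -- no replica count given
    gluster volume add-brick scratch \
        gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch

    # 4. Spread the existing data across all four bricks
    gluster volume rebalance scratch start
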
>>
>> On Apr 7, 2018, at 8:29 AM, Karthik Subrahmanya <ksubrahm@xxxxxxxxxx>
>> wrote:
>>
>> Hi Jose,
>>
>> Thanks for providing the volume info. You have 2 subvolumes. Data is
>> replicated within the bricks of those subvolumes.
>> The first one consists of Node A's brick1 & Node B's brick1, and the
>> second one consists of Node A's brick2 and Node B's brick2.
>> You don't have the same data on all 4 bricks. Data is distributed
>> between these two subvolumes.
>> To remove the replica you can use the command
>> gluster volume remove-brick scratch replica 1
>> gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
>> so you will have one copy of the data present from both of the
>> distributes.
>> Before doing this, make sure the "gluster volume heal scratch info" count
>> is zero, so the copies you retain will have the correct data.
>> After the remove-brick, erase the data from the backend.
>> Then you can expand the volume by following the steps at [1].
>>
>> [1]
>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#expanding-volumes
>>
>> Regards,
>> Karthik
>>
>> On Fri, Apr 6, 2018 at 11:39 PM, Jose Sanchez <josesanc@xxxxxxxxxxxx>
>> wrote:
>>>
>>> Hi Karthik,
>>>
>>> This is our configuration: it is 2 x 2 = 4, all replicated, and each
>>> brick has 14 TB. We have 2 nodes, A and B, each one with bricks 1 and 2:
>>> node A (replicated A1 (14 TB) and B1 (14 TB)), and the same with node B
>>> (replicated A2 (14 TB) and B2 (14 TB)).
>>>
>>> Do you think we need to degrade the node first before removing it? I
>>> believe the same copy of the data is on all 4 bricks; we would like to
>>> keep one of them and add the other bricks as extra space.
>>>
>>> Thanks for your help on this
>>>
>>> Jose
>>>
>>> [root@gluster01 ~]# gluster volume info scratch
>>>
>>> Volume Name: scratch
>>> Type: Distributed-Replicate
>>> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 2 x 2 = 4
>>> Transport-type: tcp,rdma
>>> Bricks:
>>> Brick1: gluster01ib:/gdata/brick1/scratch
>>> Brick2: gluster02ib:/gdata/brick1/scratch
>>> Brick3: gluster01ib:/gdata/brick2/scratch
>>> Brick4: gluster02ib:/gdata/brick2/scratch
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>>
>>> [root@gluster01 ~]# gluster volume status all
>>> Status of volume: scratch
>>> Gluster process                            TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick gluster01ib:/gdata/brick1/scratch    49152     49153      Y       1743
>>> Brick gluster02ib:/gdata/brick1/scratch    49156     49157      Y       1732
>>> Brick gluster01ib:/gdata/brick2/scratch    49154     49155      Y       1738
>>> Brick gluster02ib:/gdata/brick2/scratch    49158     49159      Y       1733
>>> Self-heal Daemon on localhost              N/A       N/A        Y       1728
>>> Self-heal Daemon on gluster02ib            N/A       N/A        Y       1726
>>>
>>> Task Status of Volume scratch
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> ---------------------------------
>>> Jose Sanchez
>>> Systems/Network Analyst 1
>>> Center of Advanced Research Computing
>>> 1601 Central Ave.
>>> MSC 01 1190
>>> Albuquerque, NM 87131-0001
>>> carc.unm.edu
>>> 575.636.4232
>>>
>>> On Apr 6, 2018, at 3:49 AM, Karthik Subrahmanya <ksubrahm@xxxxxxxxxx>
>>> wrote:
>>>
>>> Hi Jose,
>>>
>>> By switching to a pure distribute volume you will lose availability if
>>> something goes bad.
>>>
>>> I am guessing you have an n x 2 volume.
>>> If you want to preserve one copy of the data in all the distributes, you
>>> can do that by decreasing the replica count in the remove-brick
>>> operation.
>>> If you have any inconsistency, heal it first using the "gluster volume
>>> heal <volname>" command, and wait till the
>>> "gluster volume heal <volname> info" output becomes zero before removing
>>> the bricks, so that you will have the correct data.
>>> If you do not want to preserve the data, then you can directly remove
>>> the bricks.
>>> Even after removing the bricks, the data will be present in the backend
>>> of the removed bricks. You have to erase it manually (both the data and
>>> the .glusterfs folder).
>>> See [1] for more details on remove-brick.
>>>
>>> [1]
>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#shrinking-volumes
>>>
>>> HTH,
>>> Karthik
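
A minimal sketch of the heal-then-verify step Karthik describes (volume
name taken from the thread; re-run the info command until every brick
reports zero entries before attempting the remove-brick):

    # Trigger a heal of everything the self-heal daemon has indexed
    gluster volume heal scratch

    # Repeat until each brick shows "Number of entries: 0"
    gluster volume heal scratch info
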
>>>
>>> On Thu, Apr 5, 2018 at 8:17 PM, Jose Sanchez <josesanc@xxxxxxxxxxxx>
>>> wrote:
>>>>
>>>> We have a Gluster setup with 2 nodes (distributed replication) and we
>>>> would like to switch it to the distributed mode. I know the data is
>>>> duplicated between those nodes; what is the proper way of switching it
>>>> to distributed? We would like to double, or gain, the storage space on
>>>> our gluster storage nodes. What happens with the data? Do I need to
>>>> erase one of the nodes?
>>>>
>>>> Jose
>>>>
>>>> ---------------------------------
>>>> Jose Sanchez
>>>> Systems/Network Analyst
>>>> Center of Advanced Research Computing
>>>> 1601 Central Ave.
>>>> MSC 01 1190
>>>> Albuquerque, NM 87131-0001
>>>> carc.unm.edu
>>>> 575.636.4232

--
Regards,
Hari Gowtham.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users