On Fri, 12 Jan 2018 at 21:16, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
---------- Forwarded message ----------
From: Jose Sanchez <josesanc@xxxxxxxxxxxx>
Date: 11 January 2018 at 22:05
Subject: Re: Creating cluster replica on 2 nodes 2 bricks each.
To: Nithya Balachandran <nbalacha@xxxxxxxxxx>
Cc: gluster-users <gluster-users@xxxxxxxxxxx>

Hi Nithya

Thanks for helping me with this. I understand now, but I have a few questions.

When I had it set up in replica (just 2 nodes with 2 bricks) and tried to add bricks, it failed.

[root@gluster01 ~]# gluster volume add-brick scratch replica 2 gluster01ib:/gdata/brick2/scratch gluster02ib:/gdata/brick2/scratch
volume add-brick: failed: /gdata/brick2/scratch is already part of a volume

Did you try the add brick operation several times with the same bricks? If yes, that could be the cause as Gluster sets xattrs on the brick root directory.

After that, I ran status and info on it. In the status output I get just the two bricks:

Brick gluster01ib:/gdata/brick1/scratch 49152 49153 Y 3140
Brick gluster02ib:/gdata/brick1/scratch 49153 49154 Y 2634

and in the info output I get all 4 (2 x 2). Is this normal behavior?

So the brick count does not match for the same volume in the gluster volume status and gluster volume info commands? No, that is not normal.

Bricks:
Brick1: gluster01ib:/gdata/brick1/scratch
Brick2: gluster02ib:/gdata/brick1/scratch
Brick3: gluster01ib:/gdata/brick2/scratch
Brick4: gluster02ib:/gdata/brick2/scratch

Now when I try to mount it, I still get only 14 TB and not 28. Am I doing something wrong?
Also, when I start/stop services, the cluster goes back to replicated mode from distributed-replicate.

If the fuse mount sees only 2 bricks, that would explain the 14TB.

gluster01ib:/scratch 14T 34M 14T 1% /mnt/gluster_test

---- Gluster mount log file ----

[2018-01-11 16:06:44.963043] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-scratch-client-1: Connected to scratch-client-1, attached to remote volume '/gdata/brick1/scratch'.
[2018-01-11 16:06:44.963065] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-scratch-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2018-01-11 16:06:44.968291] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-scratch-client-1: Server lk version = 1
[2018-01-11 16:06:44.968404] I [fuse-bridge.c:4147:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2018-01-11 16:06:44.968438] I [fuse-bridge.c:4832:fuse_graph_sync] 0-fuse: switched to graph 0
[2018-01-11 16:06:44.969544] I [MSGID: 108031] [afr-common.c:2166:afr_local_discovery_cbk] 0-scratch-replicate-0: selecting local read_child scratch-client-0

---- CLI Log File ----

[root@gluster01 glusterfs]# tail cli.log
[2018-01-11 15:54:14.468122] I [socket.c:2403:socket_event_handler] 0-transport: disconnecting now
[2018-01-11 15:54:14.468737] I [cli-rpc-ops.c:817:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0
[2018-01-11 15:54:14.469462] I [cli-rpc-ops.c:817:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0
[2018-01-11 15:54:14.469530] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2018-01-11 16:03:40.422568] I [cli.c:728:main] 0-cli: Started running gluster with version 3.8.15
[2018-01-11 16:03:40.430195] I [cli-cmd-volume.c:1828:cli_check_gsync_present] 0-: geo-replication not installed
[2018-01-11 16:03:40.430492] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-01-11 16:03:40.430568] I [socket.c:2403:socket_event_handler] 0-transport: disconnecting now
[2018-01-11 16:03:40.485256] I [cli-rpc-ops.c:2244:gf_cli_set_volume_cbk] 0-cli: Received resp to set
[2018-01-11 16:03:40.485497] I [input.c:31:cli_batch] 0-: Exiting with: 0

---- etc-glusterfs-glusterd.vol.log ----

[2018-01-10 14:59:23.676814] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume scratch
[2018-01-10 15:00:29.516071] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-10 15:01:09.872082] I [MSGID: 106482] [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management: Received add brick req
[2018-01-10 15:01:09.872128] I [MSGID: 106578] [glusterd-brick-ops.c:499:__glusterd_handle_add_brick] 0-management: replica-count is 2
[2018-01-10 15:01:09.876763] E [MSGID: 106451] [glusterd-utils.c:6207:glusterd_is_path_in_use] 0-management: /gdata/brick2/scratch is already part of a volume [File exists]
[2018-01-10 15:01:09.876807] W [MSGID: 106122] [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick prevalidation failed.
[2018-01-10 15:01:09.876822] E [MSGID: 106122] [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Add brick on local node
[2018-01-10 15:01:09.876834] E [MSGID: 106122] [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre Validation Failed
[2018-01-10 15:01:16.005881] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0x33045) [0x7f1066d15045] -->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0xcbd85) [0x7f1066dadd85] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f10726491e5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=scratch --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd
[2018-01-10 15:01:15.982929] E [MSGID: 106451] [glusterd-utils.c:6207:glusterd_is_path_in_use] 0-management: /gdata/brick2/scratch is already part of a volume [File exists]
[2018-01-10 15:01:16.005959] I [MSGID: 106578] [glusterd-brick-ops.c:1352:glusterd_op_perform_add_bricks] 0-management: replica-count is set 0

Atin, is this correct? It looks like it tries to add the bricks even though the prevalidation failed
I’m guessing this validation is overruled if a force option is passed? To confirm, though, I need to know exactly which version you are running.
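
One way to check whether the "already part of a volume" prevalidation error comes from xattrs left on the brick root is to read the brick's volume-id attribute and compare it with the Volume ID that gluster volume info reports. The sketch below is a hedged illustration, not a definitive procedure: it assumes the attribute is named trusted.glusterfs.volume-id, that getfattr is installed, and that it is run as root on the brick host. The helper name volume_id_from_xattr is made up for this example.

```shell
#!/usr/bin/env bash
# Convert the hex value of trusted.glusterfs.volume-id, as printed by
# `getfattr -e hex`, into the dashed UUID form shown by `gluster volume info`.
# Reads getfattr output on stdin; prints nothing if the attribute is absent.
# (volume_id_from_xattr is a hypothetical helper name, not a gluster command.)
volume_id_from_xattr() {
    sed -n 's/^trusted\.glusterfs\.volume-id=0x\([0-9a-fA-F]\{8\}\)\([0-9a-fA-F]\{4\}\)\([0-9a-fA-F]\{4\}\)\([0-9a-fA-F]\{4\}\)\([0-9a-fA-F]\{12\}\)$/\1-\2-\3-\4-\5/p'
}

# Example invocation (brick path taken from the thread):
#   getfattr -n trusted.glusterfs.volume-id -e hex /gdata/brick2/scratch | volume_id_from_xattr
```

If the printed UUID matches the Volume ID of scratch, the brick directory was already stamped by an earlier add-brick attempt, which would be consistent with the glusterd_is_path_in_use error in the log.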
[2018-01-10 15:01:16.006018] I [MSGID: 106578] [glusterd-brick-ops.c:1362:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it
[2018-01-10 15:01:16.062001] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /gdata/brick2/scratch on port 49154
[2018-01-10 15:01:16.062137] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /gdata/brick2/scratch.rdma on port 49155
[2018-01-10 15:01:16.062673] E [MSGID: 106005] [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start brick gluster01ib:/gdata/brick2/scratch
[2018-01-10 15:01:16.062715] E [MSGID: 106074] [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add bricks
[2018-01-10 15:01:16.062729] E [MSGID: 106123] [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed.
[2018-01-10 15:01:16.062741] E [MSGID: 106123] [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Add brick on local node
[2018-01-10 15:01:16.062754] E [MSGID: 106123] [glusterd-mgmt.c:2018:glusterd_mgmt_v3_initiate_all_phases] 0-management: Commit Op Failed
[2018-01-10 15:01:35.914090] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume scratch
[2018-01-10 15:01:15.979236] I [MSGID: 106482] [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management: Received add brick req
[2018-01-10 15:01:15.979250] I [MSGID: 106578] [glusterd-brick-ops.c:499:__glusterd_handle_add_brick] 0-management: replica-count is 2
The message "I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 3 times between [2018-01-10 15:00:29.516071] and [2018-01-10 15:01:39.652014]
[2018-01-10 16:16:42.776653] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-10 16:16:42.777614] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-11 15:45:09.023393] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-11 15:45:19.916301] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume scratch
[2018-01-11 15:45:09.024217] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-11 15:54:10.172137] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume scratch
[2018-01-11 15:54:14.468529] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-11 15:54:14.469408] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req

Thanks
Jose

---------------------------------
Jose Sanchez
Center of Advanced Research Computing
Albuquerque, NM 87131

On Jan 10, 2018, at 9:02 PM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

Hi Jose,

Gluster is working as expected. The Distribute-replicated type just means that there are now 2 replica sets and files will be distributed across them.

A volume of type Replicate (1xn where n is the number of bricks in the replica set) indicates there is no distribution (all files on the volume will be present on all the bricks in the volume).

A volume of type Distributed-Replicate indicates the volume is both distributed (as in files will only be created on one of the replicated sets) and replicated. So in the above example, a file will exist on either Brick1 and Brick2 or Brick3 and Brick4.

After the add brick, the volume will have a total capacity of 28TB and store 2 copies of every file.
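
The capacity figures above follow from simple arithmetic: usable space is the number of replica sets (bricks divided by replica count) times the brick size. A minimal sketch, assuming all bricks are the same size and ignoring filesystem overhead; usable_tb is a hypothetical helper name used only for illustration:

```shell
#!/usr/bin/env bash
# usable_tb BRICKS REPLICA BRICK_TB
# Prints the usable capacity in TB of a (distributed-)replicated volume,
# assuming equally sized bricks. Fails if BRICKS is not a multiple of REPLICA.
usable_tb() {
    bricks=$1
    replica=$2
    brick_tb=$3
    if [ $((bricks % replica)) -ne 0 ]; then
        echo "brick count must be a multiple of the replica count" >&2
        return 1
    fi
    # Each replica set contributes the capacity of a single brick.
    echo $(( bricks / replica * brick_tb ))
}

usable_tb 2 2 14   # Replicate, 1 x 2 = 2 bricks: prints 14
usable_tb 4 2 14   # Distributed-Replicate, 2 x 2 = 4 bricks: prints 28
```

So a client that still sees 14T after the add-brick is most likely seeing only one replica set.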
Let me know if that is not what you are looking for.

Regards,
Nithya

On 10 January 2018 at 20:40, Jose Sanchez <josesanc@xxxxxxxxxxxx> wrote:

Hi Nithya

This is what I have so far. I have peered both cluster nodes together as replica, from node 1A and 1B. Now when I try to add it, I get the error that it is already part of a volume. When I run gluster volume info, I see that it has switched to distributed-replica.

Thanks
Jose

[root@gluster01 ~]# gluster volume status
Status of volume: scratch
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster01ib:/gdata/brick1/scratch     49152     49153      Y       3140
Brick gluster02ib:/gdata/brick1/scratch     49153     49154      Y       2634
Self-heal Daemon on localhost               N/A       N/A        Y       3132
Self-heal Daemon on gluster02ib             N/A       N/A        Y       2626

Task Status of Volume scratch
------------------------------------------------------------------------------
There are no active volume tasks

[root@gluster01 ~]#
[root@gluster01 ~]# gluster volume info

Volume Name: scratch
Type: Replicate
Volume ID: a6e20f7d-13ed-4293-ab8b-d783d1748246
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp,rdma
Bricks:
Brick1: gluster01ib:/gdata/brick1/scratch
Brick2: gluster02ib:/gdata/brick1/scratch
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: on
[root@gluster01 ~]#

-------------------------------------

[root@gluster01 ~]# gluster volume add-brick scratch replica 2 gluster01ib:/gdata/brick2/scratch gluster02ib:/gdata/brick2/scratch
volume add-brick: failed: /gdata/brick2/scratch is already part of a volume

[root@gluster01 ~]# gluster volume status
Status of volume: scratch
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster01ib:/gdata/brick1/scratch     49152     49153      Y       3140
Brick gluster02ib:/gdata/brick1/scratch     49153     49154      Y       2634
Self-heal Daemon on gluster02ib             N/A       N/A        Y       2626
Self-heal Daemon on localhost               N/A       N/A        Y       3132

Task Status of Volume scratch
------------------------------------------------------------------------------
There are no active volume tasks

[root@gluster01 ~]# gluster volume info

Volume Name: scratch
Type: Distributed-Replicate
Volume ID: a6e20f7d-13ed-4293-ab8b-d783d1748246
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp,rdma
Bricks:
Brick1: gluster01ib:/gdata/brick1/scratch
Brick2: gluster02ib:/gdata/brick1/scratch
Brick3: gluster01ib:/gdata/brick2/scratch
Brick4: gluster02ib:/gdata/brick2/scratch
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: on
[root@gluster01 ~]#

--------------------------------
Jose Sanchez
Center of Advanced Research Computing
Albuquerque, NM 87131-0001

On Jan 9, 2018, at 9:04 PM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

Hi,

Please let us know what commands you ran so far and the output of the gluster volume info command.

Thanks,
Nithya

On 9 January 2018 at 23:06, Jose Sanchez <josesanc@xxxxxxxxxxxx> wrote:

Hello

We are trying to set up Gluster for our project/scratch storage HPC machine using a replicated mode with 2 nodes, 2 bricks each (14 TB each).

Our goal is to have a replicated system between node 1 and 2 (A bricks) and add an additional 2 bricks (B bricks) from the 2 nodes, so we can have a total of 28 TB in replicated mode.

Node 1 [ (Brick A) (Brick B) ]
Node 2 [ (Brick A) (Brick B) ]
--------------------------------------------
         14Tb   +   14Tb   = 28Tb

At this point I was able to create the replica between node 1 and 2 (brick A), but I’ve not been able to add the B bricks to the replica. Gluster switches to distributed replica when I add them, with only 14 TB.

Any help will be appreciated.

Thanks
Jose

---------------------------------
Jose Sanchez
Center of Advanced Research Computing
Albuquerque, NM 87131
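
The mismatch discussed in this thread (4 bricks in gluster volume info, 2 in gluster volume status) can be checked mechanically. The following is a rough sketch that counts brick lines in each command's output; it assumes the output layout shown in the messages above, and the function names are made up for this example rather than part of the gluster CLI.

```shell
#!/usr/bin/env bash
# Both counters read command output on stdin and print a brick count.

# `gluster volume info` lists bricks as "Brick1: host:/path", "Brick2: ...".
count_info_bricks() {
    grep -c '^Brick[0-9][0-9]*: ' || true
}

# `gluster volume status` prints one row per brick process, starting with "Brick ".
count_status_bricks() {
    grep -c '^Brick ' || true
}

# Hypothetical wrapper: prints both counts and returns non-zero if they differ.
# Needs a host where the gluster CLI is available.
check_brick_counts() {
    vol=$1
    info=$(gluster volume info "$vol" | count_info_bricks)
    status=$(gluster volume status "$vol" | count_status_bricks)
    echo "info=$info status=$status"
    [ "$info" -eq "$status" ]
}

# Example: check_brick_counts scratch
```

With the outputs posted above, this would report info=4 status=2, the condition Nithya describes as not normal.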
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
--
- Atin (atinm)