Most probably this means that data on
Brick server1:/gluster_bricks/data3 49164 0 Y 4625
Brick server1:/gluster_bricks/data4 49165 0 Y 4644
is the same, and when server1 goes down you will have no access to the data in this replica set.
The same should be valid for:
Brick server1:/gluster_bricks/data5 49166 0 Y 5088
Brick server1:/gluster_bricks/data6 49167 0 Y 5128
Brick server2:/gluster_bricks/data3 49168 0 Y 22314
Brick server2:/gluster_bricks/data4 49169 0 Y 22345
Brick server2:/gluster_bricks/data5 49170 0 Y 22889
Brick server2:/gluster_bricks/data6 49171 0 Y 22932
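You can check how the bricks were actually paired into replica sets from the volume info output; every replica_count consecutive bricks form one set (replica 2 here, so pairs). A rough check, assuming the volume name from this thread:

```shell
# Full layout as Gluster sees it; "Number of Bricks" shows the
# distribute x replica arrangement (e.g. 6 x 2 = 12).
gluster volume info tank

# Quick view of just the pairing: print the brick list two per line,
# so each output line is one replica set.
gluster volume info tank | grep '^Brick[0-9]' | paste - -
```

Any output line showing both bricks on the same server is a replica set with no cross-node copy.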
I would remove those bricks and add them again, this time always specifying one brick from server1 and one from server2, so that each server has a copy of your data. Even if you haven't rebalanced yet, there could already be some data on those bricks, so it can take a while until the cluster evacuates the data.
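A rough sketch of that remediation, untested and using the brick paths from this thread; remove-brick with "start" migrates the data off the bricks first, so wait for "completed" before committing:

```shell
# Drain one mis-paired replica set (both copies currently on server1).
gluster volume remove-brick tank replica 2 \
    server1:/gluster_bricks/data3 server1:/gluster_bricks/data4 start

# Poll until the migration status shows "completed", then commit.
gluster volume remove-brick tank replica 2 \
    server1:/gluster_bricks/data3 server1:/gluster_bricks/data4 status
gluster volume remove-brick tank replica 2 \
    server1:/gluster_bricks/data3 server1:/gluster_bricks/data4 commit

# The removed brick directories keep Gluster xattrs and must be wiped
# (or recreated) before reuse. Then re-add them interleaved across
# servers, so each replica pair spans both nodes.
gluster volume add-brick tank replica 2 \
    server1:/gluster_bricks/data3 server2:/gluster_bricks/data3
```

Repeat the same drain/commit/re-add cycle for the remaining mis-paired sets, one set at a time.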
I'm a Gluster newbie, so don't take anything I say for granted.
Best Regards,
Strahil Nikolov
On Saturday, October 19, 2019, 01:40:58 AM GMT+3, Herb Burnswell <herbert.burnswell@xxxxxxxxx> wrote:
All,
We recently added 4 new bricks to an established distributed replicated volume. The original volume was created via gdeploy as:
[volume1]
action=create
volname=tank
replica_count=2
force=yes
key=performance.parallel-readdir,network.inode-lru-limit,performance.md-cache-timeout,performance.cache-invalidation,performance.stat-prefetch,features.cache-invalidation-timeout,features.cache-invalidation,performance.cache-samba-metadata
value=on,500000,600,on,on,600,on,on
brick_dirs=/gluster_bricks/data1,/gluster_bricks/data2
ignore_errors=no
This created the volume as:
# gluster vol status tank
Status of volume: tank
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick server1:/gluster_bricks/data1 49162 0 Y 20318
Brick server2:/gluster_bricks/data1 49166 0 Y 3432
Brick server1:/gluster_bricks/data2 49163 0 Y 20323
Brick server2:/gluster_bricks/data2 49167 0 Y 3435
Self-heal Daemon on localhost N/A N/A Y 25874
Self-heal Daemon on server2 N/A N/A Y 12536
Task Status of Volume tank
------------------------------------------------------------------------------
There are no active volume tasks
I have read (https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes) that distributed replicated volumes are sensitive to the order in which bricks are listed when forming replica sets:
Note: The number of bricks should be a multiple of the replica count for a distributed replicated volume. Also, the order in which bricks are specified has a great effect on data protection. Each replica_count consecutive bricks in the list you give will form a replica set, with all replica sets combined into a volume-wide distribute set. To make sure that replica-set members are not placed on the same node, list the first brick on every server, then the second brick on every server in the same order, and so on.
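Concretely, for a two-server replica-2 volume, that note implies listing the bricks interleaved across servers, so each consecutive pair spans both nodes. A sketch of what that would look like for the brick paths in this thread (illustrative only, not the commands that were actually run):

```shell
# Each pair of consecutive bricks forms a replica set, so list one
# brick from each server per pair.
gluster volume add-brick tank \
    server1:/gluster_bricks/data3 server2:/gluster_bricks/data3 \
    server1:/gluster_bricks/data4 server2:/gluster_bricks/data4
gluster volume add-brick tank \
    server1:/gluster_bricks/data5 server2:/gluster_bricks/data5 \
    server1:/gluster_bricks/data6 server2:/gluster_bricks/data6
```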
I just noticed that when we added the new bricks, we did not follow that ordering:
gluster volume add-brick tank server1:/gluster_bricks/data3 server1:/gluster_bricks/data4 force
gluster volume add-brick tank server1:/gluster_bricks/data5 server1:/gluster_bricks/data6 force
gluster volume add-brick tank server2:/gluster_bricks/data3 server2:/gluster_bricks/data4 force
gluster volume add-brick tank server2:/gluster_bricks/data5 server2:/gluster_bricks/data6 force
Which modified the volume as:
# gluster vol status tank
Status of volume: tank
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick server1:/gluster_bricks/data1 49162 0 Y 20318
Brick server2:/gluster_bricks/data1 49166 0 Y 3432
Brick server1:/gluster_bricks/data2 49163 0 Y 20323
Brick server2:/gluster_bricks/data2 49167 0 Y 3435
Brick server1:/gluster_bricks/data3 49164 0 Y 4625
Brick server1:/gluster_bricks/data4 49165 0 Y 4644
Brick server1:/gluster_bricks/data5 49166 0 Y 5088
Brick server1:/gluster_bricks/data6 49167 0 Y 5128
Brick server2:/gluster_bricks/data3 49168 0 Y 22314
Brick server2:/gluster_bricks/data4 49169 0 Y 22345
Brick server2:/gluster_bricks/data5 49170 0 Y 22889
Brick server2:/gluster_bricks/data6 49171 0 Y 22932
Self-heal Daemon on localhost N/A N/A Y 12366
Self-heal Daemon on server2 N/A N/A Y 21446
Task Status of Volume tank
------------------------------------------------------------------------------
Task : Rebalance
ID : ec958aee-edbd-4106-b896-97c688fde0e3
Status : completed
As you can see, the added bricks data3-data6 appear in a different order than the documentation recommends:
Brick server1:/gluster_bricks/data3 49164 0 Y 4625
Brick server1:/gluster_bricks/data4 49165 0 Y 4644
Brick server1:/gluster_bricks/data5 49166 0 Y 5088
Brick server1:/gluster_bricks/data6 49167 0 Y 5128
Brick server2:/gluster_bricks/data3 49168 0 Y 22314
Brick server2:/gluster_bricks/data4 49169 0 Y 22345
Brick server2:/gluster_bricks/data5 49170 0 Y 22889
Brick server2:/gluster_bricks/data6 49171 0 Y 22932
My question is what does this mean for the volume? Everything appears to be running as expected, but:
- Is there a serious problem with the way the volume is now configured?
- Have we messed up the high availability of the 2 nodes?
- Is there a way to reconfigure the volume to get it to a more optimal state?
Any help is greatly appreciated...
Thanks in advance,
HB
Community Meeting Calendar:
APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314
NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users