On 02/12/2018 05:02 PM, Niels de Vos wrote:
Hi Ravi,
Last week I was in a discussion about 4-way replication and one arbiter
(5 bricks per set). It seems that it is not possible to create this
configuration through the CLI. What would it take to make this
available?
The most important changes would be in the afr write transaction code:
deciding when to prevent winding of FOPs, and when to report a FOP cbk
as a failure if quorum is not met and for split-brain avoidance. The
current arbitration logic is mostly written for 2+1, so it would need
some thought to modify/validate it for the generic n+1 case (n being
even) that you mention.
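To illustrate the kind of decision that would need generalizing, here is a rough Python sketch of a client-quorum check for an n+1 set (n data bricks plus one arbiter). The function name and the exact rules are my own illustration of the idea, not the actual afr code:

```python
def can_wind_write(data_bricks_up, arbiter_up, n_data_bricks):
    """Illustrative quorum check for an n+1 replica set
    (n data bricks + 1 arbiter). Not the real afr logic."""
    total = n_data_bricks + 1
    up = data_bricks_up + (1 if arbiter_up else 0)
    # Require a strict majority of all bricks in the set,
    # otherwise fail the FOP in the cbk to avoid split-brain.
    if up <= total // 2:
        return False
    # The arbiter holds only metadata, no file data, so at
    # least one data brick must be available for the write.
    if data_bricks_up == 0:
        return False
    return True

# 2+1 today: one data brick plus the arbiter is 2 of 3.
print(can_wind_write(1, True, 2))   # True
# 4+1 as proposed: two data bricks plus the arbiter is 3 of 5.
print(can_wind_write(2, True, 4))   # True
# The arbiter alone can never carry a write.
print(can_wind_write(0, True, 4))   # False
```

The interesting validation work is exactly in cases like the last two: with n > 2 there are many more partial-failure combinations than the 2+1 code was written for.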
The idea is to get highly available storage, split over three
datacenters. The two large datacenters have red and blue racks
(separate power supplies, networking, etc.) and the smaller datacenter
can host the arbiter brick.
.--------------------------. .--------------------------.
| DC-1 | | DC-2 |
| .---red---. .--blue---. | | .---red---. .--blue---. |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | [b-1] | | [b-2] | |===| | [b-3] | | [b-4] | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| '---------' '---------' | | '---------' '---------' |
'--------------------------' '--------------------------'
\ /
\ /
\ /
.-------------.
| DC-3 |
| .---------. |
| | | |
| | | |
| | [a-1] | |
| | | |
| | | |
| '---------' |
'-------------'
Creating the volume looks like this, but it errors out:
# gluster volume create red-blue replica 5 arbiter 1 \
dc1-red-svr1:/bricks/b-1 dc1-blue-svr1:/bricks/b-2 \
dc2-red-svr1:/bricks/b-3 dc2-blue-svr1:/bricks/b-4 \
dc3-svr1:/bricks/a-1
For arbiter configuration, replica count must be 3 and arbiter count
must be 1. The 3rd brick of the replica will be the arbiter
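For comparison, the only form the CLI accepts today, per that error message, is the 2+1 layout; the bricks below are just one illustrative subset of the hosts above:

```shell
# Supported today: replica 3 with 1 arbiter (2 data bricks + 1 arbiter per set).
gluster volume create red-blue replica 3 arbiter 1 \
        dc1-red-svr1:/bricks/b-1 dc2-red-svr1:/bricks/b-3 \
        dc3-svr1:/bricks/a-1
```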
Possibly the thin-arbiter from https://review.gluster.org/19545 could be
a replacement for the 'full' arbiter. But that may require more time to
get stable than the current arbiter?
Thin arbiter is also targeted as a 2+1 solution, except there is only
one brick that acts as arbiter for all replica sub-volumes in a dist-rep
setup. Also, it won't participate in the I/O path in the happy case of
all bricks being up, so the latency of the thin-arbiter node can be
higher, unlike a normal arbiter, which has to be in the trusted storage
pool. The level of granularity (for file availability) is less than in
normal arbiter volumes. Details can be found at
https://github.com/gluster/glusterfs/issues/352
Regards,
Ravi
Thanks,
Niels
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel