On Tue, Feb 27, 2018 at 12:00:29PM +0530, Karthik Subrahmanya wrote:
> I will try to explain how you can end up in split-brain even with cluster
> wide quorum:

Yep, the explanation made sense. I hadn't considered the possibility of
alternating outages. Thanks!

> > > It would be great if you can consider configuring an arbiter or
> > > replica 3 volume.
> >
> > I can. My bricks are 2x850G and 4x11T, so I can repurpose the small
> > bricks as arbiters with minimal effect on capacity. What would be the
> > sequence of commands needed to:
> >
> > 1) Move all data off of bricks 1 & 2
> > 2) Remove that replica from the cluster
> > 3) Re-add those two bricks as arbiters
> >
> > (And did I miss any additional steps?)
> >
> > Unfortunately, I've been running a few months already with the current
> > configuration and there are several virtual machines running off the
> > existing volume, so I'll need to reconfigure it online if possible.
> >
> Without knowing the volume configuration it is difficult to suggest the
> configuration change,
> and since it is a live system you may end up in data unavailability or data
> loss.
> Can you give the output of "gluster volume info <volname>"
> and which brick is of what size.

Volume Name: palantir
Type: Distributed-Replicate
Volume ID: 48379a50-3210-41b4-9a77-ae143c8bcac0
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: saruman:/var/local/brick0/data
Brick2: gandalf:/var/local/brick0/data
Brick3: azathoth:/var/local/brick0/data
Brick4: yog-sothoth:/var/local/brick0/data
Brick5: cthulhu:/var/local/brick0/data
Brick6: mordiggian:/var/local/brick0/data
Options Reconfigured:
features.scrub: Inactive
features.bitrot: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
network.ping-timeout: 1013
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
features.shard: on
cluster.data-self-heal-algorithm: full
storage.owner-uid: 64055
storage.owner-gid: 64055

For brick sizes, saruman/gandalf have

$ df -h /var/local/brick0
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/gandalf-gluster  885G   55G  786G   7% /var/local/brick0

and the other four have

$ df -h /var/local/brick0
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        11T  254G   11T   3% /var/local/brick0

-- 
Dave Sherohman
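
P.S. In case it helps as a starting point, below is the rough sequence I had
in mind, pieced together from the docs. It is completely untested on my side,
and the new arbiter brick paths (a fresh /var/local/brick0/arbiter directory
on saruman and gandalf) are just placeholders, so please correct anything
that's off:

  # 1) Migrate data off the saruman/gandalf pair and drop that replica
  #    set from the volume (leaving a 2 x 2 layout):
  gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data start
  # ...poll with "status" until the migration shows completed, then commit:
  gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data status
  gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data commit

  # 2) Add the two freed servers back as arbiters, one per remaining
  #    replica pair, using fresh brick directories:
  gluster volume add-brick palantir replica 3 arbiter 1 \
      saruman:/var/local/brick0/arbiter gandalf:/var/local/brick0/arbiter

  # 3) Kick off self-heal so the new arbiter bricks get populated,
  #    and watch its progress:
  gluster volume heal palantir
  gluster volume heal palantir info

If that's roughly the right shape, I'm happy to hold off until you confirm
whether it's safe to run with the VMs still live.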