Ok, thanks - that certainly helps (a lot!), but what about all these gfid files? Are they files in split-brain or something else? The links don't cover dealing with anything like this :-(

My impression is that maybe they're files that haven't replicated and/or haven't been self-healed, for whatever reason...

> -----Original Message-----
> From: Diego Remolina [mailto:dijuremo@xxxxxxxxx]
> Sent: 30 October 2015 12:58
> To: Iain Milne
> Cc: gluster-users@xxxxxxxxxxx List
> Subject: Re: Avoiding Split Brains
>
> Yes, you need to avoid split brain on a two node replica=2 setup. You can
> just add a third node with no bricks which serves as the arbiter and set
> quorum to 51%.
>
> If you set quorum to 51% and do not have more than 2 nodes, then when one
> goes down all your gluster mounts become unavailable (or is it just read
> only?). If you run VMs on top of this then you usually end up with
> paused/frozen VMs until the volume becomes available again.
>
> These are RH-specific docs, but may help:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Quorum.html
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html
>
> The first time I hit split brain in testing, I found this blog very useful:
>
> https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
>
> HTH,
>
> Diego
>
> On Fri, Oct 30, 2015 at 8:46 AM, Iain Milne <glusterfs@xxxxxxxxxxx> wrote:
> > Anyone?
> >
> >> -----Original Message-----
> >> From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Iain Milne
> >> Sent: 21 October 2015 09:23
> >> To: gluster-users@xxxxxxxxxxx
> >> Subject: Avoiding Split Brains
> >>
> >> Hi all,
> >>
> >> We've been running a distributed setup for 3 years with no issues.
> >> Recently we switched to a 2-server, replicated setup (soon to be 4
> >> servers) and keep encountering what I assume are split-brain
> >> situations, eg:
> >>
> >> Brick server1:/brick
> >> <gfid:85893940-63a8-4fa3-bf83-9e894fe852c7>
> >> <gfid:8b325ef9-a8d2-4088-a8ae-c73f4b9390fc>
> >> <gfid:ed815f9b-9a97-4c21-86a1-da203b023cda>
> >> <gfid:7fdbd6da-b09d-4eaf-a99b-2fbe889d2c5f>
> >> ...
> >> Number of entries: 217
> >>
> >> Brick server2:/brick
> >> Number of entries: 0
> >>
> >> a) What does this mean?
> >> b) How do I go about fixing it?
> >>
> >> And perhaps more importantly, how do I avoid this happening in the future?
> >> Not once since moving to replication has either of the two servers been
> >> offline or unavailable (to my knowledge).
> >>
> >> Is some sort of server/client quorum needed (that I admit I don't fully
> >> understand)? While high availability would be nice to have, it's not
> >> essential - robustness of the data is.
> >>
> >> Thanks
> >>
> >> Iain
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users@xxxxxxxxxxx
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users@xxxxxxxxxxx
> > http://www.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
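For anyone digging this thread out of the archives later, a rough sketch of how to tell whether those <gfid:...> entries are genuine split-brain or just pending heals. The volume name "gv0" and the brick path are placeholders, and the exact syntax should be checked against the docs for your Gluster version:

    # Entries waiting to be healed (this is the output shown above)
    gluster volume heal gv0 info

    # Only entries that are actually in split-brain
    gluster volume heal gv0 info split-brain

    # If nothing is reported as split-brain, the gfid entries are usually
    # just un-healed files; kick off a full self-heal and re-check
    gluster volume heal gv0 full

    # Map a <gfid:...> entry back to a path: regular files are hard-linked
    # under .glusterfs/<aa>/<bb>/<full-gfid> on the brick, so on server1:
    find /brick -samefile /brick/.glusterfs/85/89/85893940-63a8-4fa3-bf83-9e894fe852c7

If "info split-brain" comes back empty, the 217 entries are most likely files that were written while the other brick was unreachable (or while healing was interrupted) and simply haven't been self-healed yet, which matches the guess at the top of this message.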
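And a similarly hedged sketch of the quorum/arbiter setup Diego describes, again with "gv0", "server3" and the brick paths as placeholders (cluster.quorum-type, cluster.server-quorum-type and cluster.server-quorum-ratio are the standard AFR quorum options; arbiter bricks need Gluster 3.7 or later):

    # Client-side quorum: refuse writes when a client cannot see enough
    # bricks, so the two copies cannot silently diverge (with replica 2,
    # "auto" treats the first brick as the tie-breaker)
    gluster volume set gv0 cluster.quorum-type auto

    # Server-side quorum: probe in a third peer (it needs no bricks) and
    # require a majority of peers before bricks will accept writes
    gluster peer probe server3
    gluster volume set gv0 cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%

    # On 3.7+ an arbiter brick (metadata only) can take the place of the
    # brick-less third node, e.g. when creating the volume:
    gluster volume create gv0 replica 3 arbiter 1 \
        server1:/brick server2:/brick server3:/arbiter-brick

The trade-off Diego mentions still applies: with quorum enforced on a plain two-node replica, losing a node can leave the volume unwritable (or unavailable) until it comes back, which is exactly why the cheap third node or arbiter brick is worth having.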