Ok, thanks - that certainly helps (a lot!), but what about all these gfid files? Are they files in split-brain or something else? The links don't cover dealing with anything like this :-(

My impression is that maybe they're files that haven't replicated and/or haven't been self-healed, for whatever reason...

> -----Original Message-----
> From: Diego Remolina [mailto:dijuremo@xxxxxxxxx]
> Sent: 30 October 2015 12:58
> To: Iain Milne
> Cc: gluster-users@xxxxxxxxxxx List
> Subject: Re: Avoiding Split Brains
>
> Yes, you need to avoid split brain on a two node replica=2 setup. You can
> just add a third node with no bricks which serves as the arbiter and set
> quorum to 51%.
>
> If you set quorum to 51% and do not have more than 2 nodes, then when one
> goes down all your gluster mounts become unavailable (or is it just read
> only?). If you run VMs on top of this then you usually end up with
> paused/frozen VMs until the volume becomes available again.
>
> These are RH-specific docs, but may help:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Quorum.html
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html
>
> The first time I hit split brain in testing, I found this blog very useful:
>
> https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
>
> HTH,
>
> Diego
>
> On Fri, Oct 30, 2015 at 8:46 AM, Iain Milne <glusterfs@xxxxxxxxxxx> wrote:
> > Anyone?
> >
> >> -----Original Message-----
> >> From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Iain Milne
> >> Sent: 21 October 2015 09:23
> >> To: gluster-users@xxxxxxxxxxx
> >> Subject: Avoiding Split Brains
> >>
> >> Hi all,
> >>
> >> We've been running a distributed setup for 3 years with no issues.
> >> Recently we switched to a 2-server, replicated setup (soon to be 4
> >> servers) and keep encountering what I assume are split-brain
> >> situations, eg:
> >>
> >> Brick server1:/brick
> >> <gfid:85893940-63a8-4fa3-bf83-9e894fe852c7>
> >> <gfid:8b325ef9-a8d2-4088-a8ae-c73f4b9390fc>
> >> <gfid:ed815f9b-9a97-4c21-86a1-da203b023cda>
> >> <gfid:7fdbd6da-b09d-4eaf-a99b-2fbe889d2c5f>
> >> ...
> >> Number of entries: 217
> >>
> >> Brick server2:/brick
> >> Number of entries: 0
> >>
> >> a) What does this mean?
> >> b) How do I go about fixing it?
> >>
> >> And perhaps more importantly, how do I avoid this happening in the future?
> >> Not once since moving to replication has either of the two servers been
> >> offline or unavailable (to my knowledge).
> >>
> >> Is some sort of server/client quorum needed (that I admit I don't fully
> >> understand)? While high availability would be nice to have, it's not
> >> essential - robustness of the data is.
> >>
> >> Thanks
> >>
> >> Iain
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users@xxxxxxxxxxx
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users@xxxxxxxxxxx
> > http://www.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
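For anyone digging this thread out of the archives later, a rough sketch of how to tell whether those <gfid:...> entries are genuine split-brain or just pending heals. The volume name "gv0" and the brick path are placeholders, and the exact syntax should be checked against the docs for your Gluster version:

    # Entries waiting to be healed (this is the output shown above)
    gluster volume heal gv0 info

    # Only entries that are actually in split-brain
    gluster volume heal gv0 info split-brain

    # If nothing is reported as split-brain, the gfid entries are usually
    # just un-healed files; kick off a full self-heal and re-check
    gluster volume heal gv0 full

    # Map a <gfid:...> entry back to a path: regular files are hard-linked
    # under .glusterfs/<aa>/<bb>/<full-gfid> on the brick, so on server1:
    find /brick -samefile /brick/.glusterfs/85/89/85893940-63a8-4fa3-bf83-9e894fe852c7

If "info split-brain" comes back empty, the 217 entries are most likely files that were written while the other brick was unreachable (or while healing was interrupted) and simply haven't been self-healed yet, which matches the guess at the top of this message.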
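And a similarly hedged sketch of the quorum/arbiter setup Diego describes, again with "gv0", "server3" and the brick paths as placeholders (cluster.quorum-type, cluster.server-quorum-type and cluster.server-quorum-ratio are the standard AFR quorum options; arbiter bricks need Gluster 3.7 or later):

    # Client-side quorum: refuse writes when a client cannot see enough
    # bricks, so the two copies cannot silently diverge (with replica 2,
    # "auto" treats the first brick as the tie-breaker)
    gluster volume set gv0 cluster.quorum-type auto

    # Server-side quorum: probe in a third peer (it needs no bricks) and
    # require a majority of peers before bricks will accept writes
    gluster peer probe server3
    gluster volume set gv0 cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%

    # On 3.7+ an arbiter brick (metadata only) can take the place of the
    # brick-less third node, e.g. when creating the volume:
    gluster volume create gv0 replica 3 arbiter 1 \
        server1:/brick server2:/brick server3:/arbiter-brick

The trade-off Diego mentions still applies: with quorum enforced on a plain two-node replica, losing a node can leave the volume unwritable (or unavailable) until it comes back, which is exactly why the cheap third node or arbiter brick is worth having.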