But in the vast majority of cases I'm not seeing specific paths to
split-brained files. All I get is a big list of GFIDs with one or two
human-readable paths dotted in there (that weren't there when I first
posted a week ago). How do I go from a GFID to a file I can identify?

  gluster volume heal <vol-name> info

  Brick server1:/brick
  <gfid:85893940-63a8-4fa3-bf83-9e894fe852c7>
  <gfid:8b325ef9-a8d2-4088-a8ae-c73f4b9390fc>
  <gfid:ed815f9b-9a97-4c21-86a1-da203b023cda>
  /some/path/to/a/known/file   <- that only seems to exist on one server
  <gfid:7fdbd6da-b09d-4eaf-a99b-2fbe889d2c5f>
  ...
  Number of entries: 217

  Brick server2:/brick
  Number of entries: 0

and

  gluster volume heal <vol-name> info split-brain

  Brick server1:/brick
  Number of entries in split-brain: 0

  Brick server2:/brick
  Number of entries in split-brain: 0

??
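In case it helps frame the question: from what I can piece together, every
GFID should have a matching entry under .glusterfs on the brick itself
(first two characters of the GFID, then the next two, then the full GFID).
Assuming I've understood that correctly, something like this ought to turn
one of the GFIDs above back into a real path on server1 - though I haven't
actually tried it yet, so corrections welcome:

  # the GFID's entry on the brick: a hard link for regular files,
  # a symlink for directories
  ls -l /brick/.glusterfs/85/89/85893940-63a8-4fa3-bf83-9e894fe852c7

  # for a regular file, find the path sharing that inode,
  # ignoring the .glusterfs entry itself
  find /brick -samefile \
      /brick/.glusterfs/85/89/85893940-63a8-4fa3-bf83-9e894fe852c7 \
      -not -path '*/.glusterfs/*'

And presumably, if that .glusterfs entry has a link count of 1, the file no
longer exists at any normal path on that brick - is that the right way to
read it?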
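Separately, going back to Diego's quorum suggestion further down the
thread: once the extra servers are in place, am I right that it comes down
to something like the options below? This is just my reading of the Red Hat
docs he linked, and "server3" is only a placeholder for whatever the extra
peer ends up being called.

  # add the extra node to the pool ("server3" is a placeholder name;
  # it doesn't need to host a brick)
  gluster peer probe server3

  # server-side quorum: stop bricks on a node that can't see at least
  # 51% of the trusted pool
  gluster volume set <vol-name> cluster.server-quorum-type server
  gluster volume set all cluster.server-quorum-ratio 51%

  # client-side quorum: only allow writes while a majority of the
  # replica set is reachable
  gluster volume set <vol-name> cluster.quorum-type auto

As far as I can tell, the server-side settings shut bricks down when a node
loses sight of most of the pool, while cluster.quorum-type is what actually
blocks client writes during a split - happy to be corrected on that.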
> -----Original Message-----
> From: Diego Remolina [mailto:dijuremo@xxxxxxxxx]
> Sent: 30 October 2015 14:29
> To: Iain Milne
> Cc: gluster-users@xxxxxxxxxxx List
> Subject: Re: Avoiding Split Brains
>
> Read carefully the blog from Joe Julian; it tells you how to identify
> and clear the files in split brain. Make sure you have good backups
> prior to erasing anything.
>
> https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
>
> He even provides a script.
>
> Diego
>
> On Fri, Oct 30, 2015 at 10:08 AM, Iain Milne <glusterfs@xxxxxxxxxxx> wrote:
> > Ok, thanks - that certainly helps (a lot!), but what about all these
> > gfid files? Are they files in split-brain or something else? The links
> > don't cover dealing with anything like this :-(
> >
> > My impression is that maybe they're files that haven't replicated
> > and/or haven't been self-healed, for whatever reason...
> >
> >> -----Original Message-----
> >> From: Diego Remolina [mailto:dijuremo@xxxxxxxxx]
> >> Sent: 30 October 2015 12:58
> >> To: Iain Milne
> >> Cc: gluster-users@xxxxxxxxxxx List
> >> Subject: Re: Avoiding Split Brains
> >>
> >> Yes, you need to avoid split brain on a two-node replica=2 setup. You
> >> can just add a third node with no bricks which serves as the arbiter
> >> and set quorum to 51%.
> >>
> >> If you set quorum to 51% and do not have more than 2 nodes, then when
> >> one goes down all your gluster mounts become unavailable (or is it
> >> just read-only?). If you run VMs on top of this then you usually end
> >> up with paused/frozen VMs until the volume becomes available again.
> >>
> >> These are RH-specific docs, but may help:
> >>
> >> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Quorum.html
> >>
> >> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html
> >>
> >> The first time I hit split brain in testing, I found this blog very
> >> useful:
> >>
> >> https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
> >>
> >> HTH,
> >>
> >> Diego
> >>
> >> On Fri, Oct 30, 2015 at 8:46 AM, Iain Milne <glusterfs@xxxxxxxxxxx> wrote:
> >> > Anyone?
> >> >
> >> >> -----Original Message-----
> >> >> From: gluster-users-bounces@xxxxxxxxxxx
> >> >> [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Iain Milne
> >> >> Sent: 21 October 2015 09:23
> >> >> To: gluster-users@xxxxxxxxxxx
> >> >> Subject: Avoiding Split Brains
> >> >>
> >> >> Hi all,
> >> >>
> >> >> We've been running a distributed setup for 3 years with no issues.
> >> >> Recently we switched to a 2-server, replicated setup (soon to be 4
> >> >> servers) and keep encountering what I assume are split-brain
> >> >> situations, eg:
> >> >>
> >> >> Brick server1:/brick
> >> >> <gfid:85893940-63a8-4fa3-bf83-9e894fe852c7>
> >> >> <gfid:8b325ef9-a8d2-4088-a8ae-c73f4b9390fc>
> >> >> <gfid:ed815f9b-9a97-4c21-86a1-da203b023cda>
> >> >> <gfid:7fdbd6da-b09d-4eaf-a99b-2fbe889d2c5f>
> >> >> ...
> >> >> Number of entries: 217
> >> >>
> >> >> Brick server2:/brick
> >> >> Number of entries: 0
> >> >>
> >> >> a) What does this mean?
> >> >> b) How do I go about fixing it?
> >> >>
> >> >> And perhaps more importantly, how do I avoid this happening in the
> >> >> future? Not once since moving to replication has either of the two
> >> >> servers been offline or unavailable (to my knowledge).
> >> >>
> >> >> Is some sort of server/client quorum needed (that I admit I don't
> >> >> fully understand)? While high-availability would be nice to have,
> >> >> it's not essential - robustness of the data is.
> >> >>
> >> >> Thanks
> >> >>
> >> >> Iain

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users