On 09/25/2013 06:16 AM, Andrew Lau wrote:
> That's where I found the 200+ entries.
>
> [root at hv01] gluster volume heal STORAGE info split-brain
> Gathering Heal info on volume STORAGE has been successful
>
> Brick hv01:/data1
> Number of entries: 271
> at                    path on brick
>
> 2013-09-25 00:04:29 /6682d31f-39ce-4896-99ef-14e1c9682585/dom_md/ids
> 2013-09-25 00:04:29 /6682d31f-39ce-4896-99ef-14e1c9682585/images/5599c7c7-0c25-459a-9d7d-80190a7c739b/0593d351-2ab1-49cd-a9b6-c94c897ebcc7
> 2013-09-24 23:54:29 <gfid:9c83f7e4-6982-4477-816b-172e4e640566>
> 2013-09-24 23:54:29 <gfid:91e98909-c217-417b-a3c1-4cf0f2356e14>
> <snip>
>
> Brick hv02:/data1
> Number of entries: 0
>
> When I run the same command on hv02, it shows the reverse (the other
> node having 0 entries).
>
> I remember having to delete these files individually in another
> split-brain case, but I was hoping there was a better solution than
> going through 200+ entries.

While I haven't tried it out myself, Jeff Darcy has written a script
(https://github.com/jdarcy/glusterfs/tree/heal-script/extras/heal_script)
which helps automate the process. He has detailed its usage in his blog
post: http://hekafs.org/index.php/2012/06/healing-split-brain/

Hope this helps.
-Ravi

> Cheers.
>
> On Wed, Sep 25, 2013 at 10:39 AM, Mohit Anchlia
> <mohitanchlia at gmail.com> wrote:
>
>     What's the output of
>     gluster volume heal $VOLUME info split-brain
>
>     On Tue, Sep 24, 2013 at 5:33 PM, Andrew Lau
>     <andrew at andrewklau.com> wrote:
>
>         Found the BZ, https://bugzilla.redhat.com/show_bug.cgi?id=960190,
>         so I restarted one of the volumes and that seems to have
>         restarted all the daemons again.
>
>         Self-heal started again, but I seem to have split-brain issues
>         everywhere. There are over 100 different entries on each node.
>         What's the best way to recover now, short of manually going
>         through and deleting 200+ files? It looks like a full
>         split-brain, as the file sizes on the two nodes are out of
>         balance by about 100GB or so.
>
>         Any suggestions would be much appreciated!
>
>         Cheers.
>
>         On Tue, Sep 24, 2013 at 10:32 PM, Andrew Lau
>         <andrew at andrewklau.com> wrote:
>
>             Hi,
>
>             Right now I have a 2x1 replica. Ever since I had to
>             reinstall one of the gluster servers, there have been
>             issues with split-brain. The self-heal daemon doesn't seem
>             to be running on either of the nodes.
>
>             To reinstall the gluster server (the original brick data
>             was intact but the OS had to be reinstalled), I:
>             - Reinstalled gluster
>             - Copied over the old UUID from backup
>             - gluster peer probe
>             - gluster volume sync $othernode all
>             - mount -t glusterfs localhost:STORAGE /mnt
>             - find /mnt -noleaf -print0 | xargs --null stat >/dev/null
>               2>/var/log/glusterfs/mnt-selfheal.log
>
>             I let it resync and it was working fine, or at least so I
>             thought. I came back a few days later to find a mismatch
>             between the brick volumes: one is 50GB ahead of the other.
>
>             # gluster volume heal STORAGE info
>             Status: self-heal-daemon is not running on
>             966456a1-b8a6-4ca8-9da7-d0eb96997cbe
>
>             /var/log/gluster/glustershd.log doesn't seem to have any
>             recent entries, only those from when the two original
>             gluster servers were running.
>
>             # gluster volume status
>             Self-heal Daemon on localhost    N/A    N    N/A
>
>             Any suggestions would be much appreciated!
>
>             Cheers,
>             Andrew.
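
For anyone wanting to script the manual route Andrew describes (instead of,
or before, trying Jeff Darcy's heal script above), the usual recipe is: pick
the brick that holds the stale copies, delete each split-brain file there
together with its .glusterfs gfid hard link, and let self-heal recreate it
from the good replica. Below is a rough, untested sketch of that recipe, not
the heal script itself. The volume name STORAGE and brick path /data1 are
taken from this thread; the choice of which node holds the bad copies is an
assumption you must verify per file before deleting anything. <gfid:...>
entries are skipped, and paths containing spaces are not handled.

#!/bin/bash
# Rough sketch only: removes the stale copy of every split-brain file on
# ONE brick so that self-heal can recreate it from the good replica.
# VOLUME, BRICK and the choice of "bad" node are assumptions -- inspect
# each file (e.g. with getfattr -d -m . -e hex) before deleting anything.

VOLUME=STORAGE        # volume name from this thread
BRICK=/data1          # brick path on the node holding the stale copies

gluster volume heal "$VOLUME" info split-brain |
  awk '$3 ~ /^\// {print $3}' |   # keep dated lines carrying a real path,
  sort -u |                       # skipping <gfid:...> entries and headers
  while read -r entry; do
    file="$BRICK$entry"
    [ -e "$file" ] || continue

    # Every file on a brick also has a hard link under .glusterfs named
    # after its gfid; remove that too, or the stale copy survives.
    gfid=$(getfattr -n trusted.gfid -e hex "$file" 2>/dev/null |
             awk -F= '/^trusted.gfid=/ {print substr($2, 3)}')
    [ -n "$gfid" ] || continue
    glink="$BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid:0:8}-${gfid:8:4}-${gfid:12:4}-${gfid:16:4}-${gfid:20:12}"

    echo "removing stale copy: $file"
    rm -f "$file" "$glink"
  done

Once the stale copies are gone, "gluster volume heal STORAGE full" (or
remounting and stat-ing the affected files) should pull fresh copies from
the good brick. If the self-heal daemon still shows as not running in
"gluster volume status", "gluster volume start STORAGE force" should
respawn it without disturbing bricks that are already up.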