Sure. These files are just a sampling -- a lot of other files are showing the same "split-brain" behaviour.

[14:42:45][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223185/contact.log
# file: agc/production/log/809223185/contact.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAABQAAAAAAAAAA

[14:45:15][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223185/contact.log
# file: agc/production/log/809223185/contact.log
trusted.afr.shared-application-data-client-0=0sAAACOwAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA

[14:42:53][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223185/event.log
# file: agc/production/log/809223185/event.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAADgAAAAAAAAAA

[14:45:24][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223185/event.log
# file: agc/production/log/809223185/event.log
trusted.afr.shared-application-data-client-0=0sAAAGXQAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA

[14:43:02][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223635/contact.log
# file: agc/production/log/809223635/contact.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAACgAAAAAAAAAA

[14:45:28][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809223635/contact.log
# file: agc/production/log/809223635/contact.log
trusted.afr.shared-application-data-client-0=0sAAAELQAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA

[14:43:39][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809224061/contact.log
# file: agc/production/log/809224061/contact.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAACQAAAAAAAAAA

[14:45:32][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809224061/contact.log
# file: agc/production/log/809224061/contact.log
trusted.afr.shared-application-data-client-0=0sAAAD+AAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA

[14:43:42][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809224321/contact.log
# file: agc/production/log/809224321/contact.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAACAAAAAAAAAAA

[14:45:37][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809224321/contact.log
# file: agc/production/log/809224321/contact.log
trusted.afr.shared-application-data-client-0=0sAAAERAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA

[14:43:45][root@web01:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809215319/event.log
# file: agc/production/log/809215319/event.log
trusted.afr.shared-application-data-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAABwAAAAAAAAAA

[14:45:45][root@web02:/var/glusterfs/bricks/shared]# getfattr -d -m "trusted.afr*" agc/production/log/809215319/event.log
# file: agc/production/log/809215319/event.log
trusted.afr.shared-application-data-client-0=0sAAAC/QAAAAAAAAAA
trusted.afr.shared-application-data-client-1=0sAAAAAAAAAAAAAAAA
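In case it helps, and assuming the usual AFR changelog layout (three big-endian 32-bit counters for data, metadata and entry pending operations), the base64 values that getfattr prints with an "0s" prefix decode to plain counters, e.g.:

# Rough decode of one xattr value from above (the "0s" prefix is just
# getfattr's base64 marker and is dropped here).
echo "AAAABQAAAAAAAAAA" | base64 -d | od -An -tx1
#  00 00 00 05 00 00 00 00 00 00 00 00
#  ^ data = 5   ^ metadata = 0  ^ entry = 0   (big-endian 32-bit counters)

So each brick is reporting non-zero pending data operations against the other brick's copy (client-1 on web01, client-0 on web02); the two copies are effectively blaming each other, which I gather is exactly what AFR flags as split-brain.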
On Wed, May 18, 2011 at 01:31, Pranith Kumar. Karampuri <pranithk at gluster.com> wrote:

> hi Remi,
>     It seems split-brain has been detected on the following files:
> /agc/production/log/809223185/contact.log
> /agc/production/log/809223185/event.log
> /agc/production/log/809223635/contact.log
> /agc/production/log/809224061/contact.log
> /agc/production/log/809224321/contact.log
> /agc/production/log/809215319/event.log
>
> Could you give the output of the following command for each file above on
> both the bricks in the replica pair?
>
> getfattr -d -m "trusted.afr*" <filepath>
>
> Thanks
> Pranith
>
> ----- Original Message -----
> From: "Remi Broemeling" <remi at goclio.com>
> To: gluster-users at gluster.org
> Sent: Tuesday, May 17, 2011 9:02:44 PM
> Subject: Re: Rebuild Distributed/Replicated Setup
>
> Hi Pranith. Sure, here is a pastebin sampling of logs from one of the
> hosts: http://pastebin.com/1U1ziwjC
>
> On Mon, May 16, 2011 at 20:48, Pranith Kumar. Karampuri <pranithk at gluster.com> wrote:
>
> hi Remi,
>     Would it be possible to post the logs from the client, so that we can
> find out what issue you are running into?
>
> Pranith
>
> ----- Original Message -----
> From: "Remi Broemeling" <remi at goclio.com>
> To: gluster-users at gluster.org
> Sent: Monday, May 16, 2011 10:47:33 PM
> Subject: Rebuild Distributed/Replicated Setup
>
> Hi,
>
> I've got a distributed/replicated GlusterFS v3.1.2 (installed via RPM)
> setup across two servers (web01 and web02) with the following vol config:
>
> volume shared-application-data-client-0
>     type protocol/client
>     option remote-host web01
>     option remote-subvolume /var/glusterfs/bricks/shared
>     option transport-type tcp
>     option ping-timeout 5
> end-volume
>
> volume shared-application-data-client-1
>     type protocol/client
>     option remote-host web02
>     option remote-subvolume /var/glusterfs/bricks/shared
>     option transport-type tcp
>     option ping-timeout 5
> end-volume
>
> volume shared-application-data-replicate-0
>     type cluster/replicate
>     subvolumes shared-application-data-client-0 shared-application-data-client-1
> end-volume
>
> volume shared-application-data-write-behind
>     type performance/write-behind
>     subvolumes shared-application-data-replicate-0
> end-volume
>
> volume shared-application-data-read-ahead
>     type performance/read-ahead
>     subvolumes shared-application-data-write-behind
> end-volume
>
> volume shared-application-data-io-cache
>     type performance/io-cache
>     subvolumes shared-application-data-read-ahead
> end-volume
>
> volume shared-application-data-quick-read
>     type performance/quick-read
>     subvolumes shared-application-data-io-cache
> end-volume
>
> volume shared-application-data-stat-prefetch
>     type performance/stat-prefetch
>     subvolumes shared-application-data-quick-read
> end-volume
>
> volume shared-application-data
>     type debug/io-stats
>     subvolumes shared-application-data-stat-prefetch
> end-volume
>
> In total, four servers mount this via GlusterFS FUSE.
> For whatever reason (I'm really not sure why), the GlusterFS filesystem has
> run into a bit of a split-brain nightmare (although to my knowledge an actual
> split-brain situation has never occurred in this environment), and I have
> been consistently seeing corruption issues across the filesystem as well as
> complaints that the filesystem cannot be self-healed.
>
> What I would like to do is completely empty one of the two servers (here I
> am trying to empty server web01), making the other one (in this case web02)
> the authoritative source for the data, and then have web01 completely
> rebuild its mirror directly from web02.
>
> What's the easiest/safest way to do this? Is there a command that I can run
> that will force web01 to re-initialize its mirror directly from web02 (and
> thus completely eradicate all of the split-brain errors and data
> inconsistencies)?
>
> Thanks!
>
> --
> Remi Broemeling
> System Administrator
> Clio - Practice Management Simplified
> 1-888-858-2546 x(2^5) | remi at goclio.com
> www.goclio.com | blog <http://www.goclio.com/blog> | twitter <http://www.twitter.com/goclio> | facebook <http://www.facebook.com/goclio>
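A rough sketch of the manual approach generally described for 3.1.x-era replica pairs, assuming web02 holds the good copies; the /mnt/shared mount point and the quarantine directory below are illustrative, not taken from this thread, and the steps are an outline to verify rather than a tested procedure:

# Rough outline only -- confirm before running against live bricks.
# Assumes web02 has the good copies; /mnt/shared is a client mount of the
# volume and /root/quarantine is an arbitrary holding area.

# 1. On web01, move the stale copy of a split-brain file out of the brick
#    (or clear the whole brick directory if web01 is to be rebuilt from scratch).
mkdir -p /root/quarantine/809223185
mv /var/glusterfs/bricks/shared/agc/production/log/809223185/contact.log \
   /root/quarantine/809223185/

# 2. From any client, look the file up so replicate self-heal recreates it
#    from web02's brick.
stat /mnt/shared/agc/production/log/809223185/contact.log

# 3. To walk the whole volume and trigger self-heal everywhere (the crawl
#    described in the 3.1/3.2 admin guides):
find /mnt/shared -noleaf -print0 | xargs --null stat >/dev/null

GlusterFS 3.1.x has no built-in heal command to do this automatically, so emptying web01's brick and then running the crawl from a client mount is essentially the "rebuild from web02" being asked about here; it is worth confirming the exact steps with the Gluster developers before wiping anything.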