Just FYI, I have errors similar to Remi's on 3.2.0 for certain directories, with the same type of distributed/replicated setup: input/output errors, files unable to self-heal, split-brain. Except mine say it's a permissions problem rather than a size difference, even though the permissions on both backends show the same for the affected files/dirs. The only way I fixed mine was removing the offending directory and recreating it.

-Tony

---------------------------
Manager, IT Operations
Format Dynamics, Inc.
P: 303-228-7327
F: 303-228-7305
abiacco at formatdynamics.com
http://www.formatdynamics.com
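P.S. If anyone wants to dig into the split-brain state before deleting anything, I believe you can dump the extended attributes that the replicate translator uses as its changelog directly on each backend brick (not through the mount) with something like the line below. The path is just an example from my own layout, and I haven't verified what the output looks like on every version:

    # run on each backend server, against the brick path itself
    getfattr -d -m . -e hex /var/glusterfs/bricks/shared/path/to/offending/dir

If the trusted.afr.* pending counters come back non-zero on both bricks for the same entry, neither copy can be chosen as the good one, which would match the "unable to self-heal" messages and the I/O errors on the mount.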
From: gluster-users-bounces at gluster.org On Behalf Of Remi Broemeling
Sent: Tuesday, May 17, 2011 9:33 AM
To: gluster-users at gluster.org
Subject: Re: Rebuild Distributed/Replicated Setup

Hi Pranith. Sure, here is a pastebin sampling of logs from one of the hosts: http://pastebin.com/1U1ziwjC

On Mon, May 16, 2011 at 20:48, Pranith Kumar. Karampuri <pranithk at gluster.com> wrote:

Hi Remi,
    Would it be possible to post the logs on the client, so that we can find out what issue you are running into?
Pranith

----- Original Message -----
From: "Remi Broemeling" <remi at goclio.com>
To: gluster-users at gluster.org
Sent: Monday, May 16, 2011 10:47:33 PM
Subject: Rebuild Distributed/Replicated Setup

Hi,

I've got a distributed/replicated GlusterFS v3.1.2 setup (installed via RPM) across two servers (web01 and web02) with the following vol config:

volume shared-application-data-client-0
    type protocol/client
    option remote-host web01
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-client-1
    type protocol/client
    option remote-host web02
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5
end-volume

volume shared-application-data-replicate-0
    type cluster/replicate
    subvolumes shared-application-data-client-0 shared-application-data-client-1
end-volume

volume shared-application-data-write-behind
    type performance/write-behind
    subvolumes shared-application-data-replicate-0
end-volume

volume shared-application-data-read-ahead
    type performance/read-ahead
    subvolumes shared-application-data-write-behind
end-volume

volume shared-application-data-io-cache
    type performance/io-cache
    subvolumes shared-application-data-read-ahead
end-volume

volume shared-application-data-quick-read
    type performance/quick-read
    subvolumes shared-application-data-io-cache
end-volume

volume shared-application-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes shared-application-data-quick-read
end-volume

volume shared-application-data
    type debug/io-stats
    subvolumes shared-application-data-stat-prefetch
end-volume

In total, four servers mount this via the GlusterFS FUSE client. For whatever reason (I'm really not sure why), the GlusterFS filesystem has run into a bit of a split-brain nightmare (although to my knowledge an actual split-brain situation has never occurred in this environment), and I have been seeing consistent corruption across the filesystem as well as complaints that files cannot be self-healed.

What I would like to do is completely empty one of the two servers (here I am trying to empty web01), making the other one (in this case web02) the authoritative source for the data, and then have web01 completely rebuild its mirror directly from web02. What's the easiest/safest way to do this? Is there a command that I can run that will force web01 to re-initialize its mirror directly from web02 (and thus completely eradicate all of the split-brain errors and data inconsistencies)?

Thanks!

--
Remi Broemeling
System Administrator
Clio - Practice Management Simplified
1-888-858-2546 x(2^5) | remi at goclio.com
www.goclio.com | blog | twitter | facebook
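P.S. For concreteness, the sort of procedure I have in mind is below. This is only a sketch, I haven't tried it, and /mnt/shared is just a stand-in for wherever the clients actually mount the volume; please tell me if this is the wrong approach:

    # on web01: stop glusterd and make sure the brick's glusterfsd is
    # down, so nothing serves the stale copy while it is being rebuilt
    service glusterd stop
    pkill glusterfsd

    # move the old brick contents aside rather than deleting them outright
    mv /var/glusterfs/bricks/shared /var/glusterfs/bricks/shared.old
    mkdir -p /var/glusterfs/bricks/shared

    service glusterd start

    # from one client, walk the entire tree so the replicate translator
    # re-examines (and hopefully re-creates) every file from web02
    find /mnt/shared -noleaf -print0 | xargs --null stat >/dev/null

Is stat()ing every file from a client mount still the recommended way to trigger self-heal on 3.1.x, and is it safe to run while the volume is online?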