Split brain errors

Hi Mohit,
         Do you know what exact steps lead to this problem?

Pranith.

----- Original Message -----
From: "Mohit Anchlia" <mohitanchlia at gmail.com>
To: "Amar Tumballi" <amar at gluster.com>, gluster-users at gluster.org
Sent: Friday, April 29, 2011 9:49:33 PM
Subject: Re: Split brain errors

Can someone from the dev team please reply? Should I open a bug?

On Thu, Apr 28, 2011 at 2:17 PM, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
> I got some help and fixed these by setting the xattrs. For example, I
> changed "I" to "A" using setfattr (see the sketch after the getfattr
> output below).
>
> But now my next question is why this happened in the first place, and
> what measures need to be taken so that it doesn't happen again? It keeps
> happening even if I start clean: stop the volume, delete the volume,
> delete the contents, re-create the volumes. Also, some of the bricks
> don't have the "stress-volume" attr.
>
>
>
> getfattr -dm - /data1/gluster
> getfattr: Removing leading '/' from absolute path names
> # file: data1/gluster
> trusted.afr.stress-volume-client-8=0sAAAAAAIAAAAAAAAA
> trusted.afr.stress-volume-client-9=0sAAAAAAAAAAAAAAAA
> trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
> trusted.glusterfs.dht=0sAAAAAQAAAAAqqqqqP////g==
> trusted.glusterfs.test="working\000"
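>
> (Not from the original thread; a minimal sketch of the fix described
> above, assuming the brick path and xattr name shown in the getfattr
> output. The trusted.afr.* values are pending-change records kept
> against the peer brick; non-zero bytes mean changes are still marked
> pending, and writing an all-zero value back clears them:
>
>   # clear the pending flags recorded against the client-8 peer
>   setfattr -n trusted.afr.stress-volume-client-8 \
>       -v 0sAAAAAAAAAAAAAAAA /data1/gluster
>
>   # verify: both trusted.afr.* keys should now be all "A"s (all zero)
>   getfattr -d -m trusted.afr /data1/gluster
>
> setfattr accepts the same 0s base64 encoding that getfattr prints.)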
>
>
> On Thu, Apr 28, 2011 at 9:24 AM, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
>> I create 30K directories in the client mountpoint. I've done this
>> test both with mkfs -I 256 and with the default 128-byte inode size
>> (Red Hat 5.6). I only see these errors when I run mkfs with -I 256,
>> so that looks like the reason for the failure, because otherwise
>> everything else is the same: same number of bricks, servers, user
>> (root), etc.
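>>
>> (Side note, not from the original thread: assuming the bricks sit on
>> ext3, the inode size a backend filesystem was actually formatted with
>> can be confirmed after the fact with tune2fs, e.g.:
>>
>>   tune2fs -l /dev/sdb1 | grep -i 'inode size'
>>
>> where /dev/sdb1 is a placeholder for the real brick device. With
>> 128-byte inodes the extended attributes GlusterFS relies on cannot be
>> stored inside the inode and go to a separate block, so the inode size
>> is worth pinning down when comparing the two runs.)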
>>
>> When I run the stress test, the client mount logs fill up with these
>> errors for every subvolume. It looks like it's happening for every
>> file that's being written.
>>
>> On Thu, Apr 28, 2011 at 9:20 AM, Amar Tumballi <amar at gluster.com> wrote:
>>> I see that the directory sizes are different here. Let me confirm whether
>>> we are also checking that the sizes match (for directories that check is
>>> not needed). If that is the case, this log makes sense, but it is surely a
>>> false positive.
>>> -Amar
>>>
>>> On Thu, Apr 28, 2011 at 9:44 PM, Mohit Anchlia <mohitanchlia at gmail.com>
>>> wrote:
>>>>
>>>> Yes, they are the same. It looks like this problem appears only when I
>>>> run mkfs with -I 256. Why would that be?
>>>>
>>>> [root@dsdb1 ~]# ls -ltr /data/
>>>> total 5128
>>>> drwx------     2 root root   16384 Apr 27 16:57 lost+found
>>>> drwxr-xr-x 30003 root root 4562944 Apr 27 17:15 mnt-stress
>>>> drwxr-xr-x 30003 root root  598016 Apr 27 17:15 gluster
>>>> [root@dsdb1 ~]# ls -ltr /data1/
>>>> total 572
>>>> drwx------     2 root root  16384 Apr 27 16:59 lost+found
>>>> drwxr-xr-x 30003 root root 561152 Apr 27 17:15 gluster
>>>>
>>>> [root@dsdb2 ~]# ls -ltr /data
>>>> total 588
>>>> drwx------     2 root root  16384 Apr 27 16:52 lost+found
>>>> drwxr-xr-x     2 root root   4096 Apr 27 17:09 mnt-stress
>>>> drwxr-xr-x 30003 root root 573440 Apr 27 17:15 gluster
>>>> [root@dsdb2 ~]# ls -ltr /data1
>>>> total 592
>>>> drwx------     2 root root  16384 Apr 27 16:54 lost+found
>>>> drwxr-xr-x 30003 root root 581632 Apr 27 17:15 gluster
>>>>
>>>>
>>>> On Wed, Apr 27, 2011 at 11:18 PM, Amar Tumballi <amar at gluster.com> wrote:
>>>> >>
>>>> >> [2011-04-27 17:11:29.13142] E
>>>> >> [afr-self-heal-metadata.c:524:afr_sh_metadata_fix]
>>>> >> 0-stress-volume-replicate-0: Unable to self-heal permissions/ownership
>>>> >> of '/' (possible split-brain). Please fix the file on all backend
>>>> >> volumes
>>>> >>
>>>> >> Can someone please help me understand the reason for this problem?
>>>> >>
>>>> >> gluster volume info all
>>>> >>
>>>> >> Volume Name: stress-volume
>>>> >> Type: Distributed-Replicate
>>>> >> Status: Started
>>>> >> Number of Bricks: 8 x 2 = 16
>>>> >> Transport-type: tcp
>>>> >> Bricks:
>>>> >> Brick1: dsdb1:/data/gluster
>>>> >> Brick2: dsdb2:/data/gluster
>>>> >
>>>> > Did you check the permissions/ownership of these exports? Please make
>>>> > sure that they are the same.
>>>> > Regards,
>>>> > Amar
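>>>> >
>>>> > (Illustration, not from the original mail: a quick way to compare
>>>> > mode and ownership across the exports, run on each server, is:
>>>> >
>>>> >   stat -c '%A %u:%g %n' /data/gluster /data1/gluster
>>>> >
>>>> > The output should be identical on every node hosting a replica
>>>> > pair.)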
>>>> >
>>>> >>
>>>> >> Brick3: dsdb3:/data/gluster
>>>> >> Brick4: dsdb4:/data/gluster
>>>> >> Brick5: dsdb5:/data/gluster
>>>> >> Brick6: dsdb6:/data/gluster
>>>> >> Brick7: dslg1:/data/gluster
>>>> >> Brick8: dslg2:/data/gluster
>>>> >> Brick9: dsdb1:/data1/gluster
>>>> >> Brick10: dsdb2:/data1/gluster
>>>> >> Brick11: dsdb3:/data1/gluster
>>>> >> Brick12: dsdb4:/data1/gluster
>>>> >> Brick13: dsdb5:/data1/gluster
>>>> >> Brick14: dsdb6:/data1/gluster
>>>> >> Brick15: dslg1:/data1/gluster
>>>> >> Brick16: dslg2:/data1/gluster
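>>>> >>
>>>> >> (For reference, a sketch, not taken from the original thread, of the
>>>> >> create command that would produce this 8 x 2 layout; with replica 2,
>>>> >> bricks pair up in the order listed, so Brick1/Brick2, Brick3/Brick4,
>>>> >> and so on form the replica sets:
>>>> >>
>>>> >>   gluster volume create stress-volume replica 2 transport tcp \
>>>> >>     dsdb1:/data/gluster  dsdb2:/data/gluster \
>>>> >>     dsdb3:/data/gluster  dsdb4:/data/gluster \
>>>> >>     dsdb5:/data/gluster  dsdb6:/data/gluster \
>>>> >>     dslg1:/data/gluster  dslg2:/data/gluster \
>>>> >>     dsdb1:/data1/gluster dsdb2:/data1/gluster \
>>>> >>     dsdb3:/data1/gluster dsdb4:/data1/gluster \
>>>> >>     dsdb5:/data1/gluster dsdb6:/data1/gluster \
>>>> >>     dslg1:/data1/gluster dslg2:/data1/gluster
>>>> >>
>>>> >> Knowing the pairs helps when deciding which brick's xattrs to reset.)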
>>>> >
>>>> >
>>>> >
>>>
>>>
>>
>
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

