frequent split-brain detected, aborting selfheal; background meta-data self-heal failed

Hi,

I'm seeing rather frequent (several times per minute) log entries like:

[2013-01-08 16:45:03.399791] I [afr-common.c:1038:afr_launch_self_heal] 0-shared-replicate-0: background  meta-data self-heal triggered. path: /lfd/techstudiolfc/pub
[2013-01-08 16:45:03.400224] I [afr-self-heal-common.c:705:afr_mark_sources] 0-shared-replicate-0: split-brain possible, no source detected
[2013-01-08 16:45:03.400253] E [afr-self-heal-metadata.c:512:afr_sh_metadata_fix] 0-shared-replicate-0: Unable to self-heal permissions/ownership of '/lfd/techstudiolfc/pub' (possible split-brain). Please fix the file on all backend volumes
[2013-01-08 16:45:03.400417] I [afr-self-heal-metadata.c:81:afr_sh_metadata_done] 0-shared-replicate-0: split-brain detected, aborting selfheal of /lfd/techstudiolfc/pub
[2013-01-08 16:45:03.400453] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-shared-replicate-0: background  meta-data self-heal failed on /lfd/techstudiolfc/pub


However, when I check the affected directory, the permissions and ownership appear identical on both servers:

[root@ca1.sg1 /]# ls -ld /data/gluster/lfd/techstudiolfc/pub
drwxr-xr-x 2 userftp userftp 4096 Jun  6  2012 /data/gluster/lfd/techstudiolfc/pub

[root@ca1.sg1 /]# attr -l /data/gluster/lfd/techstudiolfc/pub
Attribute "gfid" has a 16 byte value for /data/gluster/lfd/techstudiolfc/pub
Attribute "afr.shared-client-0" has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub
Attribute "afr.shared-client-1" has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub





[root@ca2.sg1 /]# ls -ld /data/gluster/lfd/techstudiolfc/pub
drwxr-xr-x 2 userftp userftp 4096 Jun  6  2012 /data/gluster/lfd/techstudiolfc/pub

[root@ca2.sg1 /]# attr -l /data/gluster/lfd/techstudiolfc/pub
Attribute "gfid" has a 16 byte value for /data/gluster/lfd/techstudiolfc/pub
Attribute "afr.shared-client-0" has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub
Attribute "afr.shared-client-1" has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub


What could be the problem?

I'm using glusterfs 3.2.6 on Debian Squeeze, and I see the very same problem on different servers.
It only seems to affect directories.
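
Note that `attr -l` above only prints the size of each attribute, not its value, so the listings cannot show whether the two bricks disagree. A rough sketch of how the 12-byte afr values could be compared follows. It assumes the value is three big-endian 32-bit pending counters in the order data, metadata, entry, and that the hex strings come from something like `getfattr -d -m . -e hex <path>` run on each brick. The function names and the sample values in the usage note are hypothetical, not taken from these servers.

```python
import struct


def decode_afr_changelog(hex_value):
    """Split a 12-byte AFR changelog xattr (given as hex) into its counters.

    Assumption: the layout is three big-endian 32-bit integers holding
    the pending data, metadata and entry operation counts, in that order.
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def mutual_blame(brick0_about_client1, brick1_about_client0, kind="metadata"):
    """Return True if each brick records pending operations against the other.

    brick0_about_client1: hex value of afr.<vol>-client-1 as stored on brick 0.
    brick1_about_client0: hex value of afr.<vol>-client-0 as stored on brick 1.
    Both being non-zero for the same kind is the classic split-brain pattern.
    """
    first = decode_afr_changelog(brick0_about_client1)[kind]
    second = decode_afr_changelog(brick1_about_client0)[kind]
    return first != 0 and second != 0
```

For example, with the made-up value `0x000000000000000100000000`, `decode_afr_changelog` gives a metadata counter of 1 and zero for data and entry. If both bricks hold a non-zero metadata counter against each other, `mutual_blame` returns True, which would be consistent with the "no source detected" message even though `ls -ld` shows identical permissions.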

-- 
Tomasz Chmielewski
http://wpkg.org

