Re: [gluster] possible split-brain issue

Arnold Yang,
       I see that the directories /export/vdb1/brick/ and /export/vdb1/brick/mpdis/ are in metadata split-brain. You can follow the document https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md to fix it.
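
For reference, a minimal sketch of the manual method that document describes, assuming the copy on dmf-wpst-1 is kept as the good one (which brick to trust is your call; the client index below follows the xattr output quoted later in this thread):

# On the brick whose copy is discarded (dmf-wpst-2 in this example), clear
# the pending counters that blame the surviving brick (client-0):
setfattr -n trusted.afr.gv0-client-0 -v 0x000000000000000000000000 /export/vdb1/brick
setfattr -n trusted.afr.gv0-client-0 -v 0x000000000000000000000000 /export/vdb1/brick/mpdis

# Then trigger a heal so the chosen copy is propagated:
gluster volume heal gv0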

The files export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1 and export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2 don't seem to be in split-brain as per the extended attributes. Could you send the stat output of these two files on both bricks?
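
For example, on each of dmf-wpst-1 and dmf-wpst-2:

stat /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
stat /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2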

Pranith
On 01/23/2015 02:18 PM, Arnold Yang wrote:

Hi Pranith,

No worries!

Here is the output of the other brick:

[root@dmf-wpst-2 ~]# getfattr -d -m. -e hex /export/vdb1/brick/
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000001500000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x51de44c3f01e486da6b710c7b7a270d7

[root@dmf-wpst-2 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000200000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0x8ff7afeb996244cd9d1bf213568398d7
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
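
As an aside, each trusted.afr value packs three consecutive 32-bit counters: pending data, metadata, and entry operations, in that order. A small bash sketch to decode the non-zero value on /export/vdb1/brick/ above:

v=000000000000001500000000    # trusted.afr.gv0-client-0, without the leading 0x
echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
# prints: data=0 metadata=21 entry=0 -- 21 pending *metadata* operations
# blamed on client-0, which is why this is a metadata (not data) split-brain.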

 

 

[root@dmf-wpst-2 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0x85ed306b179b46819d7c02eb336543b8

[root@dmf-wpst-2 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0xa826a389e7a042c2b5175a1acbecae9b

From: Pranith Kumar Karampuri [mailto:pkarampu@xxxxxxxxxx]
Sent: Friday, January 23, 2015 3:42 PM
To: Arnold Yang; Jifeng Li; Gluster-users@xxxxxxxxxxx
Subject: Re: [gluster] possible split-brain issue

 

hi Arnold,
   It seems you gave the output for only one brick. Could you also provide it for the other brick? Sorry I didn't make that clear in my earlier mail.

Pranith

On 01/23/2015 10:44 AM, Arnold Yang wrote:

Hi Pranith,

Here is the output for the commands you provided. If you need anything more, please tell us!

Thanks!

[root@dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000001400000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x51de44c3f01e486da6b710c7b7a270d7

[root@dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000400000000
trusted.gfid=0x8ff7afeb996244cd9d1bf213568398d7
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0x85ed306b179b46819d7c02eb336543b8

[root@dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
getfattr: Removing leading '/' from absolute path names
# file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000000
trusted.gfid=0xa826a389e7a042c2b5175a1acbecae9b

From: Pranith Kumar Karampuri [mailto:pkarampu@xxxxxxxxxx]
Sent: Thursday, January 22, 2015 12:14 AM
To: Jifeng Li; Gluster-users@xxxxxxxxxxx; Arnold Yang
Subject: Re: [gluster] possible split-brain issue

 

 

On 01/14/2015 04:48 PM, Jifeng Li wrote:

Hi,

[issue]: To verify that the GlusterFS mount point works, a script periodically uses an HTTP PUT to upload a file to a subdirectory under the mount point, which serves as the Apache DocumentRoot. After running for some time, the errors below appear:

[2015-01-14 09:18:40.915639] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
[2015-01-14 09:18:41.924584] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
[2015-01-14 09:18:41.925182] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /
[2015-01-14 09:18:41.934827] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/mpdis' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
[2015-01-14 09:18:41.935375] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
[2015-01-14 09:18:42.943742] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
[2015-01-14 09:18:42.944432] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /
[2015-01-14 09:18:42.946664] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/mpdis' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
[2015-01-14 09:18:42.947323] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
[2015-01-14 09:18:43.955929] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
[2015-01-14 09:18:43.956701] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv0-replicate-0:  metadata self heal  failed,   on /
[2015-01-14 09:18:43.958874] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 0-gv0-replicate-0: Unable to self-heal contents of '/mpdis' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
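
As I read AFR's pending matrix, row i is brick i's changelog view of how many operations are still pending on each brick. So for '/':

[ [ 0 20 ]     <- brick-1's changelog: 20 ops pending on the other brick
  [ 21 0 ] ]   <- brick-2's changelog: 21 ops pending on the other brick

Both off-diagonal entries are non-zero, so each brick blames the other and self-heal cannot pick a source; that is exactly the split-brain condition, and the counts match the trusted.afr values quoted earlier in this thread.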

 

In addition, I see the Input/output errors shown below when listing the files under the mount point:

 

[root@dmf-wpst-2 mpdis]# ll
total 0
-rwxr-xr-x. 1 apache apache 0 Jan 14 04:21 test.rep.00.00.00.00.dmf1
-rw-r--r--. 1 apache apache 0 Jan 14 04:21 test.rep.00.00.00.00.dmf2

[root@dmf-wpst-2 mpdis]# ll
total 0
-rwxr-xr-x. 1 apache apache 0 Jan 14 04:21 test.rep.00.00.00.00.dmf1
-rw-r--r--. 1 apache apache 0 Jan 14 04:21 test.rep.00.00.00.00.dmf2

[root@dmf-wpst-2 mpdis]# ll
ls: cannot open directory .: Input/output error

[root@dmf-wpst-2 mpdis]# ll
ls: cannot access test.rep.00.00.00.00.dmf1: Input/output error
ls: cannot access test.rep.00.00.00.00.dmf2: Input/output error
total 0
?????????? ? ? ? ?            ? test.rep.00.00.00.00.dmf1
?????????? ? ? ? ?            ? test.rep.00.00.00.00.dmf2

Any tips about debugging further or getting this fixed up would be appreciated.
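
One standard check to list exactly which entries the self-heal daemon considers split-brained (runnable on either server):

gluster volume heal gv0 info split-brain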

 

[version]: 3.5.3

[environment]: two virtual servers, each with one brick:

 

[root@dmf-wpst-2 mpdis]# gluster volume status
Status of volume: gv0
Gluster process                              Port   Online  Pid
------------------------------------------------------------------------------
Brick dmf-ha-1-glusterfs:/export/vdb1/brick  49152  Y       332
Brick dmf-ha-2-glusterfs:/export/vdb1/brick  49154  Y       19396
Self-heal Daemon on localhost                N/A    Y       19410
Self-heal Daemon on 10.175.123.246           N/A    Y       999

 

[root@dmf-wpst-1 mpdis]# gluster volume info
Volume Name: gv0
Type: Replicate
Volume ID: 51de44c3-f01e-486d-a6b7-10c7b7a270d7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: dmf-ha-1-glusterfs:/export/vdb1/brick
Brick2: dmf-ha-2-glusterfs:/export/vdb1/brick
Options Reconfigured:
nfs.disable: ON
network.ping-timeout: 2
storage.bd-aio: on
storage.linux-aio: on
cluster.eager-lock: on
performance.client-io-threads: on
performance.cache-refresh-timeout: 60
performance.io-thread-count: 64
performance.cache-size: 8GB
cluster.server-quorum-type: none
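
Two of these options stand out for a two-brick replica (an observation on the configuration, not something raised in the thread): network.ping-timeout: 2 lets clients drop a brick after only two seconds, and with cluster.server-quorum-type: none there is nothing to stop both bricks from accepting writes while disconnected from each other, which is exactly how split-brain accumulates. Client-side quorum is the usual guard, at the cost of write availability when a brick is down:

gluster volume set gv0 cluster.quorum-type auto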

 

[mount-point info]:

1. mount command

glusterfs -p /var/run/glusterfs.pid --volfile-server=dmf-ha-1-glusterfs --volfile-server=dmf-ha-2-glusterfs --volfile-id=gv0 /dmfcontents

2. mount point directory hierarchy

[root@dmf-wpst-2 /]# ls -ld /dmfcontents/
drwxr-xr-x. 5 root root 71 Jan 14 04:39 /dmfcontents/
[root@dmf-wpst-2 /]# ls -ld /dmfcontents/mpdis/
drwxr-xr-x. 2 apache apache 89 Jan 14 04:39 /dmfcontents/mpdis/
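
For reference, roughly the equivalent mount through the mount helper (a sketch; the backupvolfile-server option name should be verified against the 3.5 mount.glusterfs documentation):

mount -t glusterfs -o backupvolfile-server=dmf-ha-2-glusterfs dmf-ha-1-glusterfs:/gv0 /dmfcontents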

hi Jifeng Li,
     Sorry for the delay in response. Could you post the output of:
'getfattr -d -m. -e hex <brick-path>'
'getfattr -d -m. -e hex <brick-path>/mpdis'
'getfattr -d -m. -e hex <brick-path>/mpdis/test.rep.00.00.00.00.dmf1'
'getfattr -d -m. -e hex <brick-path>/mpdis/test.rep.00.00.00.00.dmf2'

Pranith


 

 

 






_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
