Re: after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command!

On Thu, Sep 28, 2017 at 12:11 PM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@xxxxxxxxxxxxxxx> wrote:

 

The version I am using is glusterfs 3.6.9

This is a very old version that has reached EOL. It would be great if you could upgrade to one of the supported versions (3.10 or 3.12).
They have many new features, bug fixes, and performance improvements. If you can try to reproduce the issue on one of them, that would be
very helpful.

Regards,
Karthik

Best regards,
Cynthia
(周琳)

MBB SM HETRAN SW3 MATRIX 

Storage        
Mobile: +86 (0)18657188311

 

From: Karthik Subrahmanya [mailto:ksubrahm@xxxxxxxxxx]
Sent: Thursday, September 28, 2017 2:37 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@xxxxxxxxxxxxxxx>
Cc: Gluster-users@xxxxxxxxxxx; gluster-devel@xxxxxxxxxxx
Subject: Re: after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command!

On Thu, Sep 28, 2017 at 11:41 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@xxxxxxxxxxxxxxx> wrote:

Hi,

Thanks for the reply!

I’ve checked [1], but the problem is that nothing is shown by the command “gluster volume heal <volume-name> info”, so these split-brain files can only be detected when an application tries to access them.

I can find the gfid mismatch for those split-brain entries in the mount log; however, nothing shows up in the shd log. The shd does not know about those split-brain entries because there is nothing in the indices/xattrop directory.

I guess it was there before, and then it got cleared by one of the heal processes, either client side or server side. I wanted to check that by examining the logs.

Which version of gluster are you running, by the way?

 

The log is not available right now; when the issue is reproduced, I will provide it to you. Thanks!

Ok.

 

Best regards,
Cynthia
(周琳)

MBB SM HETRAN SW3 MATRIX 

Storage        
Mobile: +86 (0)18657188311

 

From: Karthik Subrahmanya [mailto:ksubrahm@xxxxxxxxxx]
Sent: Thursday, September 28, 2017 2:02 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@xxxxxxxxxxxxxxx>
Cc: Gluster-users@xxxxxxxxxxx; gluster-devel@xxxxxxxxxxx
Subject: Re: after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command!

 

Hi,

To resolve the gfid split-brain you can follow the steps at [1].

Since the pending markers are not set on the files, they do not show up in heal info.
To debug this issue, I need some more data from you. Could you provide the following?

1. volume info

2. mount log

3. brick logs

4. shd log
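
For reference when reading the xattr dumps in this thread: the "pending markers" are the trusted.afr.* extended attributes. Each value is 12 bytes, read as three 32-bit big-endian counters for pending data, metadata, and entry operations, in that order, so a value of all zeros means no heal is pending. Below is a minimal sketch of decoding such a value. The function name decode_afr_pending is made up for illustration and is not part of gluster.

```shell
# Hypothetical helper (not a gluster tool): decode a trusted.afr.* value as
# printed by "getfattr -e hex" into its three pending counters.
decode_afr_pending() {
    hex=${1#0x}                      # drop the leading "0x" if present
    # 12 bytes = 24 hex digits: data, metadata, entry (4 bytes each).
    if [ "${#hex}" -ne 24 ]; then
        echo "expected 24 hex digits, got: $1" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    # printf %d accepts a 0x-prefixed constant and prints it in decimal.
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$meta" "0x$entry"
}

# Example with the value reported for mn-1 in this thread:
decode_afr_pending 0x000000000000000000000000
```

With all three counters at zero on both bricks, neither copy blames the other, which is consistent with heal info listing nothing.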

 

May I also know which version of gluster you are running? From the info you have provided, it looks like an old version.

If it is, then it would be great if you could upgrade to one of the latest supported releases.


[1] http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain

 

Thanks & Regards,

Karthik

On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@xxxxxxxxxxxxxxx> wrote:

 

Hi gluster experts,

I have run into a tough “split-brain” problem. Sometimes, after a hard reboot, we find some files in split-brain; however, neither the files nor their parent directory are shown by the command “gluster volume heal <volume-name> info”, and there is no entry in the .glusterfs/indices/xattrop directory either. Can you help shed some light on this issue? Thanks!
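
A note on what "no entry" means here, with a small sketch. The assumption, based on my understanding of the index translator and not on anything stated in this thread, is that entries needing heal appear in indices/xattrop as files named by gfid, next to a base file named xattrop-<uuid> that exists even when nothing is pending. The helper name list_pending_index_entries is hypothetical.

```shell
# Hypothetical helper: list heal-index entries in a brick's indices/xattrop
# directory, skipping the base "xattrop-<uuid>" file (assumed to be present
# even when nothing is pending).
list_pending_index_entries() {
    dir=$1
    for f in "$dir"/*; do
        [ -e "$f" ] || continue      # empty directory: the glob did not match
        name=${f##*/}                # strip the directory part
        case $name in
            xattrop-*) ;;            # base file, not a pending entry
            *) printf '%s\n' "$name" ;;
        esac
    done
}

# Usage (brick path taken from this report):
# list_pending_index_entries /mnt/bricks/services/brick/.glusterfs/indices/xattrop
```

Under that assumption, a directory listing that shows only the xattrop-<uuid> file counts as empty for healing purposes.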

 

 

 

The following is some info from our environment.

Checking from the sn-0 client, nothing is shown as being in split-brain!

 

[root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info
Brick sn-0:/mnt/bricks/services/brick/
Number of entries: 0

Brick sn-1:/mnt/bricks/services/brick/
Number of entries: 0

[root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info split-brain
Gathering list of split brain entries on volume services has been successful

Brick sn-0.local:/mnt/bricks/services/brick
Number of entries: 0

Brick sn-1.local:/mnt/bricks/services/brick
Number of entries: 0

[root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
# ls -l /mnt/services/netserv/ethip/
ls: cannot access '/mnt/services/netserv/ethip/sn-2': Input/output error
ls: cannot access '/mnt/services/netserv/ethip/mn-1': Input/output error
total 3
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-0
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-1
-rw-r--r-- 1 root root 145 Sep 26 20:35 as-2
-rw-r--r-- 1 root root 237 Sep 26 20:36 mn-0
-????????? ? ?    ?      ?            ? mn-1
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-0
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-1
-????????? ? ?    ?      ?            ? sn-2

 

Checking from the glusterfs server side, the gfid of mn-1 differs between sn-0 and sn-1:

 

[SN-0]

[root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0x53a33f437464475486f31c4e44d83afd

[root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd51h/64849d    Inode: 2536        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
Birth: -

[root@sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-63f8bbcb-7fa6-4fc8-b721-675a05de0ab3

[root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# ls
53a33f43-7464-4754-86f3-1c4e44d83afd

[root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# stat 53a33f43-7464-4754-86f3-1c4e44d83afd
  File: 53a33f43-7464-4754-86f3-1c4e44d83afd
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd51h/64849d    Inode: 2536        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
Birth: -
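
The output above shows how a file's trusted.gfid relates to its hard link under .glusterfs: the first two hex digits name the first directory level, the next two name the second, and the link itself is the gfid written as a dashed UUID. Here is a minimal sketch of that mapping; the function name gfid_to_backend_path is made up for illustration and is not a gluster command.

```shell
# Hypothetical helper: convert a trusted.gfid hex value (as printed by
# "getfattr -e hex") into the relative path of its link under .glusterfs.
gfid_to_backend_path() {
    hex=${1#0x}                      # drop the leading "0x" if present
    # A gfid is a 128-bit UUID, i.e. exactly 32 hex digits.
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got: $1" >&2
        return 1
    fi
    # Re-insert the dashes of the 8-4-4-4-12 UUID layout.
    uuid=$(printf '%s' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    d1=$(printf '%s' "$hex" | cut -c1-2)
    d2=$(printf '%s' "$hex" | cut -c3-4)
    printf '.glusterfs/%s/%s/%s\n' "$d1" "$d2" "$uuid"
}

# gfid of mn-1 on sn-0, taken from the getfattr output above:
gfid_to_backend_path 0x53a33f437464475486f31c4e44d83afd
# prints .glusterfs/53/a3/53a33f43-7464-4754-86f3-1c4e44d83afd
```

The stat output for mn-1 and for the .glusterfs entry reports the same inode (2536) with Links: 2, which matches the two names being hard links to one file.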

 


[SN-1]

[root@sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
#  getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@sn-1:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0xf7f10f980acc4041a015e48018571d4a

[root@sn-1:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd41h/64833d    Inode: 2608        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
Birth: -

[root@sn-1:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-240713ea-eda3-4914-a55d-7dd4aed724ed

[root@sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
# stat f7f10f98-0acc-4041-a015-e48018571d4a
  File: f7f10f98-0acc-4041-a015-e48018571d4a
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd41h/64833d    Inode: 2608        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
Birth: -
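
Since heal info reports nothing in this situation, one way to spot the mismatch is to compare the trusted.gfid lines from each brick directly, as done by eye above. The sketch below does that comparison on saved getfattr output. The function names extract_gfid and gfid_status are hypothetical, and collecting the output from each brick (for example over ssh) is left to the reader.

```shell
# Print the trusted.gfid value found in "getfattr -d -e hex" output on stdin.
extract_gfid() {
    sed -n 's/^trusted\.gfid=//p'
}

# Hypothetical helper: compare the gfid that two bricks report for the same
# file name. Arguments: the saved getfattr output from each brick.
gfid_status() {
    a=$(printf '%s\n' "$1" | extract_gfid)
    b=$(printf '%s\n' "$2" | extract_gfid)
    if [ -z "$a" ] || [ -z "$b" ]; then
        echo "unknown"               # at least one side had no gfid line
    elif [ "$a" = "$b" ]; then
        echo "match"
    else
        echo "mismatch"              # same name, different gfid
    fi
}

# Usage, run once per brick and then compared:
#   out0=$(getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip/mn-1)
#   gfid_status "$out0" "$out1"
```

For the mn-1 values shown above (0x53a3... on sn-0, 0xf7f1... on sn-1) this reports a mismatch.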

 

 

Best regards,
Cynthia
(周琳)

MBB SM HETRAN SW3 MATRIX 

Storage        
Mobile: +86 (0)18657188311

 

 

 


 

 

 


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users

 

 

