Re: Split-brain seen with [0 0] pending matrix and io-cache page errors

On 10/18/2014 04:36 PM, Anirban Ghoshal wrote:
Hi,

Yes, the sizes do differ, and considerably. I'd forgotten to mention that in my last email. Their mtimes, however, as far as I could tell on separate servers, seemed to coincide.

Thanks,
Anirban


Are these files always open? And is it possible that the file was renamed while one of the bricks was offline? I know of a race that can cause this. I'm just trying to find out whether this is the same case.

Pranith


From: Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>;
To: Anirban Ghoshal <chalcogen_eg_oxygen@xxxxxxxxx>; gluster-users@xxxxxxxxxxx <gluster-users@xxxxxxxxxxx>;
Subject: Re: Split-brain seen with [0 0] pending matrix and io-cache page errors
Sent: Sat, Oct 18, 2014 12:26:08 AM

hi,
      Could you check whether the size of the file differs between the two bricks?

Pranith
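
One way to run the size check suggested above is sketched below. It is only a minimal sketch: `describe` and `compare` are hypothetical helper names, not part of GlusterFS, and it assumes you run `describe` on each server against the brick-local copy of the file (not the fuse mount) and then bring the results together by hand.

```python
import os

def describe(path):
    """Collect the size and mtime (whole seconds) of one brick-local copy."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime": int(st.st_mtime)}

def compare(reports):
    """Compare reports gathered on each server.

    `reports` maps a server label (any string you choose) to the dict
    returned by describe() on that server.
    """
    sizes = {r["size"] for r in reports.values()}
    mtimes = {r["mtime"] for r in reports.values()}
    return {
        "size_mismatch": len(sizes) > 1,    # True if the copies differ in size
        "mtime_mismatch": len(mtimes) > 1,  # True if the copies differ in mtime
    }
```

Calling `compare({"server1": report1, "server2": report2})` with the two dicts collected on the servers shows at a glance whether the sizes or the mtimes disagree.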

On 10/18/2014 04:20 AM, Anirban Ghoshal wrote:
Hi everyone,

I have this really confusing split-brain here that's bothering me. I am running glusterfs 3.4.2 over Linux 2.6.34. I have a replica 2 volume 'testvol' that is [...] It seems I cannot read/stat/edit the file in question, and `gluster volume heal testvol info split-brain` shows nothing. Here are the logs from the fuse-mount for the volume:

[2014-09-29 07:53:02.867111] W [fuse-bridge.c:1172:fuse_err_cbk] 0-glusterfs-fuse: 4560969: FLUSH() ERR => -1 (Input/output error) 
[2014-09-29 07:54:16.007799] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8529d20 & waitq = 0x7fd5c8067d40 
[2014-09-29 07:54:16.007854] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561103: READ => -1 (Input/output error) 
[2014-09-29 07:54:16.008018] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8607ee0 & waitq = 0x7fd5c8067d40 
[2014-09-29 07:54:16.008056] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561104: READ => -1 (Input/output error) 
[2014-09-29 07:54:16.008233] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c8066f30 & waitq = 0x7fd5c8067d40 
[2014-09-29 07:54:16.008269] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561105: READ => -1 (Input/output error) 
[2014-09-29 07:54:16.008800] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c860bcf0 & waitq = 0x7fd5c863b1f0 
[2014-09-29 07:54:16.008839] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561107: READ => -1 (Input/output error) 
[2014-09-29 07:54:16.009365] W [page.c:991:__ioc_page_error] 0-testvol-io-cache: page error for page = 0x7fd5c85fd120 & waitq = 0x7fd5c8067d40 
[2014-09-29 07:54:16.009413] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 4561109: READ => -1 (Input/output error) 
[2014-09-29 07:54:16.040549] W [afr-open.c:213:afr_open] 0-testvol-replicate-0: failed to open as split brain seen, returning EIO 
[2014-09-29 07:54:16.040594] W [fuse-bridge.c:915:fuse_fd_cbk] 0-glusterfs-fuse: 4561142: OPEN() /SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log => -1 (Input/output error)

Could somebody please give me a clue about where to begin? I checked the xattrs on /SECLOG/20140908.d/SECLOG_00000000000000427425_00000000000000000000.log and it seems the changelogs are [0, 0] on both replicas, and the gfids match.

Thank you very much for any help on this.
Anirban
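
For anyone repeating the xattr check described above, the changelog values can be decoded with a short script. This is a sketch under stated assumptions: it assumes the values were dumped on each brick with `getfattr -d -m . -e hex <brick-local path>`, and that each `trusted.afr.*` value is 12 bytes holding three big-endian 32-bit pending counters (data, metadata, entry). The function names are hypothetical and not part of GlusterFS.

```python
import struct

def decode_afr_changelog(hex_value):
    """Decode one trusted.afr.* value as printed by `getfattr -e hex`.

    Assumes the layout is three big-endian 32-bit pending counters:
    data, metadata, entry.
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}

def is_clean(changelog):
    """True when no operation of any kind is marked pending."""
    return all(count == 0 for count in changelog.values())
```

An all-zero value such as `0x000000000000000000000000` decodes to zero pending data, metadata and entry operations, which is what a [0, 0] pending matrix on both replicas corresponds to.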





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users

