On 01/28/2015 10:58 PM, Ml Ml wrote:
"/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" is a binary file.
Here is the output of gluster volume info:
--------------------------------------------------------------------------------------
[root@ovirt-node03 ~]# gluster volume info
Volume Name: RaidVolB
Type: Replicate
Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ovirt-node03.example.local:/raidvol/volb/brick
Brick2: ovirt-node04.example.local:/raidvol/volb/brick
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: disable
nfs.disable: on
[root@ovirt-node04 ~]# gluster volume info
Volume Name: RaidVolB
Type: Replicate
Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ovirt-node03.example.local:/raidvol/volb/brick
Brick2: ovirt-node04.example.local:/raidvol/volb/brick
Options Reconfigured:
nfs.disable: on
user.cifs: disable
auth.allow: *
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 36
storage.owner-gid: 36
Here is the getfattr output on node03 and node04:
--------------------------------------------------------------------------------------
[root@ovirt-node03 ~]# getfattr -d -m . -e hex
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
trusted.afr.RaidVolB-client-0=0x000000000000000000000000
trusted.afr.RaidVolB-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
[root@ovirt-node04 ~]# getfattr -d -m . -e hex
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
trusted.afr.RaidVolB-client-0=0x000000000000000000000000
trusted.afr.RaidVolB-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
These xattrs seem to indicate there is no split-brain for the file;
heal info also shows 0 entries on both bricks.
Are you getting an I/O error when you read
"1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" from the mount?
If yes, is there a difference in file size between the two nodes? How about
the contents (check whether the md5sum is the same)?
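For example, something like this on each node should show whether the two
brick copies diverge (a sketch, assuming the brick path from your getfattr
output above):

# compare size and checksum of the brick copy on node03 and node04
stat -c '%s bytes' /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
md5sum /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids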
Am I supposed to run those commands on the mounted volume? Here is the mount:
--------------------------------------------------------------------------------------
127.0.0.1:RaidVolB on
/rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)
At the very beginning I removed the file with "rm
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids",
hoping gluster would then fix itself somehow :)
It was gone, but it seems to be back again. I don't know if this is any help.
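If I understand the brick layout correctly, an rm of only the regular path
would not be enough anyway: each file on a brick also has a hard link under
the brick's .glusterfs directory named after its gfid, and the copy on the
other node is still intact, so self-heal can bring the file back. For
reference (a sketch using the trusted.gfid from above; both entries should
report the same inode number):

ls -li /raidvol/volb/brick/.glusterfs/1c/15/1c15d0cb-1cca-4627-841c-395f7b712f73
ls -li /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids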
Here is gluster volume heal RaidVolB info on both nodes:
--------------------------------------------------------------------------------------
[root@ovirt-node03 ~]# gluster volume heal RaidVolB info
Brick ovirt-node03.example.local:/raidvol/volb/brick/
Number of entries: 0
Brick ovirt-node04.example.local:/raidvol/volb/brick/
Number of entries: 0
[root@ovirt-node04 ~]# gluster volume heal RaidVolB info
Brick ovirt-node03.example.local:/raidvol/volb/brick/
Number of entries: 0
Brick ovirt-node04.example.local:/raidvol/volb/brick/
Number of entries: 0
Thanks a lot,
Mario
On Wed, Jan 28, 2015 at 4:57 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 01/28/2015 08:34 PM, Ml Ml wrote:
Hello Ravi,
thanks a lot for your reply.
The data on ovirt-node03 is the one which I want to keep.
Here is the info collected by following the howto:
https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
[root@ovirt-node03 ~]# gluster volume heal RaidVolB info split-brain
Gathering list of split brain entries on volume RaidVolB has been
successful
Brick ovirt-node03.example.local:/raidvol/volb/brick
Number of entries: 0
Brick ovirt-node04.example.local:/raidvol/volb/brick
Number of entries: 14
at path on brick
-----------------------------------
2015-01-27 17:33:00 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
2015-01-27 17:34:01 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
2015-01-27 17:35:04 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
2015-01-27 17:36:05 <gfid:cd411b57-6078-4f3c-80d1-0ac1455186a6>/ids
2015-01-27 17:37:06 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:37:07 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:38:08 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:38:21 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:39:22 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:40:23 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:41:24 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:42:25 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:43:26 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
2015-01-27 17:44:27 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
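For reference, a <gfid:...> entry like the ones above can be mapped back to
a path on the brick via its gfid hard link, e.g. with GNU find (a sketch for
the first entry; adjust the brick path as needed):

# find the regular path that shares an inode with the gfid hard link,
# which lives at .glusterfs/<2-hex>/<2-hex>/<full-gfid>
find /raidvol/volb/brick -samefile \
  /raidvol/volb/brick/.glusterfs/1c/15/1c15d0cb-1cca-4627-841c-395f7b712f73 \
  -not -path '*/.glusterfs/*'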
[root@ovirt-node03 ~]# gluster volume heal RaidVolB info
Brick ovirt-node03.example.local:/raidvol/volb/brick/
Number of entries: 0
Brick ovirt-node04.example.local:/raidvol/volb/brick/
Number of entries: 0
Hi Mario,
Is "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" a file or a directory?
Whatever it is, it should show up in the output of the heal info / heal info
split-brain commands on both nodes, but I see it listed only in the output
from node03.
Also, heal info is showing zero entries on both nodes, which is strange.
Are node03 and node04 bricks of the same replica pair? Can you share the
'gluster volume info' output for RaidVolB?
How did you infer that there is a split-brain? Does accessing the file(s)
from the mount give an input/output error?
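For example, a plain read through the FUSE mount should fail with an
input/output error if the file really is in split-brain (a sketch, using the
mount path you showed above):

# reading a split-brain file via the glusterfs mount returns EIO
dd if=/rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids \
   of=/dev/null bs=1M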
[root@ovirt-node03 ~]# getfattr -d -m . -e hex
/raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
trusted.afr.RaidVolB-client-0=0x000000000000000000000000
trusted.afr.RaidVolB-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
What is the getfattr output for this file on the other brick? The
AFR-specific xattrs being all zeros certainly don't indicate the possibility
of a split-brain.
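For reference, the 24 hex digits after "0x" in a trusted.afr.* value are
three 4-byte counters: pending data, metadata and entry operations, in that
order. A hypothetical value indicating a data split-brain would look like
this in each brick's xattr for the other brick (spaces added for
readability):

trusted.afr.RaidVolB-client-1=0x 000003d7 00000000 00000000
                                 |-data-| |-meta-| |-entry|

All-zero values, as in your output, mean neither brick has pending
operations recorded against the other.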
The "Resetting the relevant changelogs to resolve the split-brain: "
part of the howto is now a little complictaed. Do i have a data or
meta split brain now?
I guess i have a data split brain in my case, right?
What are my next setfattr commands nowin my case if i want to keep the
data from node03?
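From my reading of the howto, I would guess something like this, run on
node04, to clear its data changelog for client-0 (the node03 brick) so that
node03 becomes the heal source; please correct me if that is wrong:

# on node04: reset the changelog entry blaming the node03 brick (client-0)
setfattr -n trusted.afr.RaidVolB-client-0 -v 0x000000000000000000000000 \
  /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids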
Thanks a lot!
Mario
On Wed, Jan 28, 2015 at 9:44 AM, Ravishankar N <ravishankar@xxxxxxxxxx>
wrote:
On 01/28/2015 02:02 PM, Ml Ml wrote:
I want to take the file from either node03 or node04; I really don't
mind which. Can I not just tell gluster to use one node as the
"current" one?
Policy-based split-brain resolution [1], which does just that, has been
merged into master and should be available in glusterfs 3.7.
For the moment, you would have to modify the xattrs on one of the bricks
and trigger a heal. See
https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
for how to do it.
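In outline, once the changelog on the brick whose copy you do not want has
been reset, something like this should kick off the heal and let you verify
it (a sketch, using your volume name):

gluster volume heal RaidVolB        # trigger self-heal
gluster volume heal RaidVolB info   # re-check the pending-heal list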
Hope this helps,
Ravi
[1] http://review.gluster.org/#/c/9377/