Fixing heal / split-brain when the entry is a directory

I have a bunch of heal problems on a volume. For this email, I won't speculate about what caused them - that's a whole other discussion that I may have at some point in the future. This will concentrate on fixing the immediate problems so I can move forward.

Thanks to JoeJulian's blog posts and talking to him in the IRC channel, I have a pretty good handle on how to fix entries in the 'heal $vol info' output ... but only if the entry given refers to a real *file* or a gluster link file. Almost all of the entries in my report are directories, and I have no idea how to fix them.

All I have for these entries is gfid values, so I first locate the entry in .glusterfs. In this case, it's a symlink.

[root@slc01dfs001a ~]# stat /bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1
  File: `/bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1' -> `../../a7/30/a730505c-84f3-407f-ac27-d45465a17f40/331'
  Size: 52              Blocks: 0          IO Block: 4096   symbolic link
Device: fd06h/64774d    Inode: 2152112572  Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-06-21 03:17:27.740839811 -0600
Modify: 2013-06-21 03:17:27.740839811 -0600
Change: 2013-06-21 03:17:27.740839811 -0600
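
For anyone following along, the lookup path can be built from the gfid alone. This is only a small sketch based on the layout visible in the stat command above (first two hex characters, next two hex characters, then the full gfid). The function name gfid_backend_path is made up for illustration and is not a gluster command.

```shell
# Hypothetical helper: print where a gfid should live under a brick's
# .glusterfs directory, following the layout seen in the stat output.
gfid_backend_path() (
    brick=$1
    gfid=$2
    # First two and next two hex characters of the gfid name the subdirectories.
    d1=$(printf '%s' "$gfid" | cut -c1-2)
    d2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$gfid"
)

gfid_backend_path /bricks/d00v00/mdfs fe93de6e-5b91-4193-a31c-786726886ff1
```

The printed path can then be passed to stat or readlink on each brick.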

To figure out what the actual directory name is, I use readlink:

[root@slc01dfs001a ~]# readlink -f /bricks/d00v00/mdfs/.glusterfs/fe/93/fe93de6e-5b91-4193-a31c-786726886ff1
/bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331

I can get the extended attributes. I know from talking to Joe Julian that the following output means both copies think the other needs healing. If I compare 'ls -al' output from the brick directory on both copies, they are the same.

[root@slc01dfs001a ~]# getfattr -m . -d -e hex /bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331
getfattr: Removing leading '/' from absolute path names
# file: bricks/d00v00/mdfs/REDACTED/mdfs/RTR/rtrphotosfour/docs/331
trusted.afr.mdfs-client-0=0x00000000000000000000006e
trusted.afr.mdfs-client-1=0x00000000000000000000006e
trusted.gfid=0xfe93de6e5b914193a31c786726886ff1
trusted.glusterfs.dht=0x00000001000000003ffffffc4ffffffa
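
To make the trusted.afr values easier to read, here is a rough sketch that splits one into its counters. It assumes the usual AFR changelog layout of three 32-bit big-endian pending counters in the order data, metadata, entry; that layout is my understanding and is not shown in the output above. The name decode_afr is hypothetical.

```shell
# Hypothetical helper: split a trusted.afr.* hex value into three counters.
# Assumes 12 bytes = data, metadata, entry pending counts (4 bytes each).
decode_afr() (
    hex=${1#0x}
    # 12 bytes are 24 hex digits; refuse anything else.
    [ "${#hex}" -eq 24 ] || { echo "expected 24 hex digits" >&2; exit 1; }
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    # printf converts the 0x-prefixed fields to decimal.
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$meta" "0x$entry"
)

decode_afr 0x00000000000000000000006e
# prints: data=0 metadata=0 entry=110
```

Under that assumption, both attributes above carry only a non-zero entry counter, which fits a directory where each copy has pending entry operations recorded against both clients.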

Now for the big question ... what do I do, in a step-by-step format, to eliminate this entry from the heal info output? On another entry, I tried deleting the second trusted.afr entry on both copies, I tried deleting them both, I tried deleting one and setting the other to zero, and I tried changing them both to zero. In between each of these, I did a stat on the directory via the FUSE mount. None of these changed the heal info output.

Thanks,
Shawn