Re: Recovering badly corrupted directory

"Sincock, John [FLCPTY]" <J.Sincock@xxxxxxxxx> · Thu, 3 Sep 2015 10:50:53 +0930

Hi Everybody,
Perhaps I asked too many questions at once in my first mail, sorry... 

But if anyone can provide any info on the one question below, it might
help:

Q) I realise that if a file has ------T perms, zero size, and a linkto
xattr, then it is a gluster linkto file.

But, we also have other ------T perm files, which are puzzling:
 - files which DO have a linkto xattr, but also have size > 0.  What are
these?! If the file is a gluster linkto, then why is its size > 0?
 - files which do NOT have a linkto xattr, and have size > 0. The data
in these files seems to be just endless NULs (Hex 00).  Does anyone have
any idea what these files are and why they've ended up with ------T
perms? 
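
A minimal sketch of how the ------T files on a brick could be sorted
into the cases above. It assumes the linkto xattr is named
trusted.glusterfs.dht.linkto and that it is run as root directly on a
brick (Linux only); the category labels and function names are made up
for illustration and are not gluster terminology.

```python
import os
import stat

# Assumption: DHT keeps the link target in this xattr on the brick file.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def is_sticky_only(mode: int) -> bool:
    """True for a regular file whose permission bits are exactly ---------T."""
    return stat.S_ISREG(mode) and stat.S_IMODE(mode) == stat.S_ISVTX


def classify(mode: int, size: int, has_linkto: bool) -> str:
    """Sort a brick file into the cases described in the question."""
    if not is_sticky_only(mode):
        return "normal"
    if has_linkto and size == 0:
        return "linkto"                  # the expected linkto file
    if has_linkto:
        return "linkto-with-data"        # puzzling case 1
    if size > 0:
        return "sticky-data-no-linkto"   # puzzling case 2
    return "sticky-empty-no-linkto"


def is_all_nul(path: str, chunk: int = 1 << 20) -> bool:
    """True if the file contains nothing but NUL (hex 00) bytes."""
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                return True
            if block.strip(b"\x00"):
                return False


def classify_path(path: str) -> str:
    """Classify one file on a brick by reading its mode, size and xattr."""
    st = os.lstat(path)
    try:
        os.getxattr(path, LINKTO_XATTR, follow_symlinks=False)
        has_linkto = True
    except OSError:
        # Missing xattr, or no permission to read trusted.* attributes.
        has_linkto = False
    return classify(st.st_mode, st.st_size, has_linkto)
```

Calling classify_path() on each ------T file, and is_all_nul() on the
ones without a linkto xattr, gives a count per case. It only labels the
files; it does not say which copy is the good one.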

If anyone can explain what these files are, I'd be grateful.

Thanks again.
John

-----Original Message-----
From: Sincock, John [FLCPTY] 
Sent: Wednesday, 26 August 2015 3:55 PM
To: gluster-users
Subject: Recovering badly corrupted directory 

Hi Everybody,

I'm trying to recover a badly corrupted directory on our gluster, and
need some advice:

It looks like we've hit this bug here, which was reported against
gluster 2.1 and is unresolved:
https://bugzilla.redhat.com/show_bug.cgi?id=1034148
Bug 1034148 - "DHT : on lookup getting error ' cannot read symbolic link
<dir1>: Invalid argument' or 'Input/output error' and logs says
"[posix.c:737:posix_readlink] 0-flat-posix: readlink on <dir> failed
Invalid argument" + it shows directory twice in output"

Some googling shows what looks like the same bug on glusterfs 3.3 here:
http://www.gluster.org/pipermail/gluster-users/2013-August/014038.html
and glusterfs 3.4.2-1 here:
http://www.gluster.org/pipermail/gluster-users/2014-February/016271.html

We are running gluster 3.4.1-3.el6.x86_64 on CentOS 6.4.

Our data for the corrupted folder appears to exist on the bricks but is
unusable via the gluster volume.
There are many files with ------T permissions, many of which have zero
size; others have data.

Here is what I get when I list the original problem directory:

ls -la /gluster/vol00/archive/Online_Archive/Survey/Riegl/2014/Saudi/H11_Riegl_RiAcquire_Raw_Data/14_11_SAS18_1_RiAcquire\ \(FieldRawData\)/

ls: cannot read symbolic link /gluster/vol00/archive/Online_Archive/Survey/Riegl/2014/Saudi/H11_Riegl_RiAcquire_Raw_Data/14_11_SAS18_1_RiAcquire (FieldRawData)/14_11_140823_S004_DNP: Invalid argument

ls: cannot read symbolic link /gluster/vol00/archive/Online_Archive/Survey/Riegl/2014/Saudi/H11_Riegl_RiAcquire_Raw_Data/14_11_SAS18_1_RiAcquire (FieldRawData)/14_11_140902_S021: Invalid argument

<snips many more like this>
total 0

lrwxrwxrwx 0 root    root     70 Jul 13 16:30 14_11_140819_S001_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140819_S001_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:30 14_11_140819_S001_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140819_S001_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:30 14_11_140819_S001_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140819_S001_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:30 14_11_140819_S001_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140819_S001_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:30 14_11_140819_S001_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140819_S001_DNP
lrwxrwxrwx 0 root    root     66 Jul 13 16:31 14_11_140821_S002 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S002
lrwxrwxrwx 0 root    root     66 Jul 13 16:31 14_11_140821_S002 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S002
lrwxrwxrwx 0 root    root     66 Jul 13 16:31 14_11_140821_S002 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S002
lrwxrwxrwx 0 root    root     66 Jul 13 16:31 14_11_140821_S002 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S002
lrwxrwxrwx 0 root    root     66 Jul 13 16:31 14_11_140821_S002 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S002
lrwxrwxrwx 0 root    root     70 Jul 13 16:31 14_11_140821_S003_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S003_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:31 14_11_140821_S003_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S003_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:31 14_11_140821_S003_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S003_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:31 14_11_140821_S003_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S003_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:31 14_11_140821_S003_DNP -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140821_S003_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:32 14_11_140823_S004_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:32 14_11_140823_S004_DNP
lrwxrwxrwx 1 root    root     70 Jul 13 16:32 14_11_140823_S004_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:32 14_11_140823_S004_DNP
lrwxrwxrwx 0 root    root     70 Jul 13 16:32 14_11_140823_S004_DNP
lrwxrwxrwx 0 root    root     66 Jul 13 14:38 14_11_140823_S005 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140823_S005
lrwxrwxrwx 0 root    root     66 Jul 13 14:38 14_11_140823_S005 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140823_S005
lrwxrwxrwx 0 root    root     66 Jul 13 14:38 14_11_140823_S005 -> ../../a9/65/a9650fe5-0436-472f-8f75-4d7b5bf0e676/14_11_140823_S005

<snips many more like this>

As you can see, the listing is a catastrophic mess, with a multitude of
duplicate entries, symbolic links which give this "invalid argument"
nonsense, and all the symlinks to ../../a9/65/blahblah are broken.

I think I've managed to restore access to most of the data by rsyncing
it from its original location on each brick to a new location on the
brick (not copying extended attributes), and then recursively listing
the new copy via the gluster volume to pull the new data into the volume.

The copied data we've brought back into the gluster volume looks like it
is all or mostly there, but the new copy still contains quite a few
duplicates of files with ------T permissions. Trying to rsync from this
new copy therefore throws errors saying "structure needs to be cleaned"
and that items failed verification. That is not surprising, as rsync has
no way of coping properly with these duplicates.

E.g., here is a small subfolder:
ll "/gluster/vol00/gluster-recovery-2/14_11_SAS18_1_RiAcquire (FieldRawData)/14_11_141023_S091_DNP/08_RECEIVED/"
total 3165
-rwxr-xr-x 1 survey1 surveyor 2702101 Oct 23  2014 14_11_141023_S091.rhk
---------T 1 survey1 surveyor 2702101 May 27 04:49 14_11_141023_S091.rhk
-rwxr-xr-x 1 survey1 surveyor  407859 Oct 23  2014 14_11_141023_S091.rpc
---------T 1 survey1 surveyor  407859 May 27 04:49 14_11_141023_S091.rpc
-rwxr-xr-x 1 survey1 surveyor   28653 Oct 23  2014 14_11_141023_S091.rpl
-rwxr-xr-x 1 survey1 surveyor  101609 Oct 23  2014 14_11_141023_S091.rpp

Note the duplicates with ------T perms.
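
To list such duplicates without reading the output by eye, something
like the following sketch could be used. It parses `ls -l` style text,
with the listing above pasted in as sample input, and reports names that
appear more than once with at least one ---------T entry. The function
name is hypothetical, and the parsing assumes the nine-column layout
shown above.

```python
from collections import defaultdict

# Sample input: the subfolder listing quoted above.
LISTING = """\
total 3165
-rwxr-xr-x 1 survey1 surveyor 2702101 Oct 23  2014 14_11_141023_S091.rhk
---------T 1 survey1 surveyor 2702101 May 27 04:49 14_11_141023_S091.rhk
-rwxr-xr-x 1 survey1 surveyor  407859 Oct 23  2014 14_11_141023_S091.rpc
---------T 1 survey1 surveyor  407859 May 27 04:49 14_11_141023_S091.rpc
-rwxr-xr-x 1 survey1 surveyor   28653 Oct 23  2014 14_11_141023_S091.rpl
-rwxr-xr-x 1 survey1 surveyor  101609 Oct 23  2014 14_11_141023_S091.rpp
"""


def sticky_duplicates(listing: str) -> list:
    """Return names listed more than once where one entry is ---------T."""
    perms_by_name = defaultdict(list)
    for line in listing.splitlines():
        # perms, links, owner, group, size, month, day, time-or-year, name
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # skips "total N" and blank lines
        perms_by_name[fields[8]].append(fields[0])
    return sorted(
        name
        for name, perms in perms_by_name.items()
        if len(perms) > 1 and "---------T" in perms
    )


if __name__ == "__main__":
    for name in sticky_duplicates(LISTING):
        print(name)
```

For the sample listing this reports the .rhk and .rpc files, the two
names that have a ------T twin.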

The questions I have are:

1) Is there a better way to clean up this corruption? 

2) What are these ------T files and why are they there?!?!?!?!

3) What caused the initial problem which corrupted our data?
	I was on leave when this problem was noticed, but my colleagues
are pretty sure the directory looked OK after our last rebalance, so we
do not think this problem occurred during rebalance.
	We have 3 nodes in our cluster, and one of them is having issues
with occasional spontaneous reboots, so it does drop off the gluster at
times and then returns. But I don't think we have modified or moved any
of the corrupted data recently, so I do not think the problem was caused
by data being moved during rebalance, or by the node rebooting while
data was being manually moved from one place to another.

4) Can I just delete the remaining ------T files from the data we copied
and re-added into the volume?
	If I do so, is there any chance that the other duplicate is bad,
and the T-file is the good copy that I should have saved?
	What are the chances this cleanup has fixed everything? Are
there still likely to be corrupt and/or missing files?

5) If we can get our copy of this data cleaned up, we would like to
delete the original corrupted folder from the volume, by going behind
gluster and deleting the data off the bricks.
	What is the correct procedure for doing this?
	I.e., if we delete the bad data off all the bricks, this will
leave files or links in the .glusterfs folder, won't it?
	How do I find the correct files under .glusterfs to delete? Or
do I just delete from the bricks and then wait until the next time we do
a rebalance, and let the rebalance clean up the mess?
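
On finding the matching entries under .glusterfs: a rough sketch,
assuming each brick file carries its GFID as 16 raw bytes in the
trusted.gfid xattr and that the handle lives at
.glusterfs/<first two hex digits>/<next two>/<full gfid>, which is the
same layout as the ../../a9/65/a9650fe5-... symlink targets in the
listing above. The brick root "/brick" and the function names are
hypothetical.

```python
import os
import uuid

# Assumption: the GFID is stored as 16 raw bytes in this xattr on the brick.
GFID_XATTR = "trusted.gfid"


def gfid_handle_path(brick_root: str, gfid_bytes: bytes) -> str:
    """Build the .glusterfs handle path for a 16-byte GFID."""
    gfid = str(uuid.UUID(bytes=gfid_bytes))  # lowercase, hyphenated form
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def handle_for(brick_root: str, path: str) -> str:
    """Read the GFID of a brick file and return its handle path.

    Linux only; reading trusted.* xattrs needs root.
    """
    raw = os.getxattr(path, GFID_XATTR, follow_symlinks=False)
    return gfid_handle_path(brick_root, raw)


if __name__ == "__main__":
    # GFID taken from the symlink targets in the listing above.
    example = uuid.UUID("a9650fe5-0436-472f-8f75-4d7b5bf0e676").bytes
    print(gfid_handle_path("/brick", example))
```

This only maps a brick file to its handle path; whether a given handle
is safe to remove is a separate question that the sketch does not
answer.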

Any advice would be appreciated!

Thanks muchly,
John
