Previous quoted posts removed for brevity...
Martin Fick wrote:
It does seem like it would be fairly easy to add another
metadata attribute to each file/directory that would hold
a checksum for it. This way, AFR itself could be
configured to check/compute the checksum anytime the file
is read/written. Since this would slow AFR down, I would
suggest a configuration option to turn this on. If the
checksum is wrong, it could heal from the other brick's
copy, provided that copy's checksum is correct.
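To make that concrete, here is a rough Python sketch of the idea
(illustration only, not AFR code): it assumes the checksum lives in a
made-up xattr named user.glusterfs.checksum and that both bricks'
backend directories are reachable as plain paths on one machine.

import hashlib
import os
import shutil

CHECKSUM_ATTR = "user.glusterfs.checksum"   # hypothetical attribute name

def file_checksum(path):
    # Any digest would do; SHA-1 is just a concrete choice for the sketch.
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest().encode()

def checksum_ok(path):
    # True/False if a checksum is recorded, None if none exists yet.
    try:
        stored = os.getxattr(path, CHECKSUM_ATTR)
    except OSError:
        return None
    return stored == file_checksum(path)

def check_and_heal(local_copy, peer_copy):
    # If our copy fails its checksum and the peer's copy passes, pull the
    # good copy over -- a stand-in for letting AFR self-heal the file.
    if checksum_ok(local_copy) is False and checksum_ok(peer_copy):
        shutil.copy2(peer_copy, local_copy)
        os.setxattr(local_copy, CHECKSUM_ATTR, file_checksum(local_copy))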
Another alternative would be to create an offline
checksummer that updates such an attribute if it does not
exist, and checks the checksum if it does exist. If the
checksum check fails, it would simply delete the
file and its attributes (and potentially the directory
attributes up the tree) so that AFR will then heal it.
The only modification needed by AFR to support this
would be to delete the checksum attribute anytime the
file/directory is updated so that the offline checksummer
will recreate it instead of thinking it is corrupt.
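A sketch of what such an offline checksummer might look like, again in
Python with the same made-up attribute name (not real code); the
delete-on-mismatch step is the "let AFR heal it" idea described above.

import hashlib
import os

CHECKSUM_ATTR = "user.glusterfs.checksum"   # same hypothetical attribute

def file_checksum(path):
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest().encode()

def scrub_brick(brick_root):
    # Walk one brick's backend directory: record checksums that are
    # missing, verify the ones that exist, and delete anything that
    # fails so AFR can heal it back from the other brick.
    for dirpath, _dirs, files in os.walk(brick_root):
        for name in files:
            path = os.path.join(dirpath, name)
            current = file_checksum(path)
            try:
                stored = os.getxattr(path, CHECKSUM_ATTR)
            except OSError:
                # No checksum yet: either a new file, or AFR dropped the
                # attribute after an update.  Just record a fresh one.
                os.setxattr(path, CHECKSUM_ATTR, current)
                continue
            if stored != current:
                os.remove(path)   # let AFR re-create it from the good copy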
In fact, even this could be eliminated so that the
offline checksummer is completely "self-powered":
anytime it calculates a checksum it could copy the
glusterfs version and timestamp attributes to two new
"checksummer" attributes. If these become out of date, the
checksummer will know to recompute the checksum instead of
assuming that the file has been corrupted.
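Roughly like this, with invented attribute names standing in for AFR's
real version/changelog and timestamp xattrs:

import os

VERSION_ATTR        = "trusted.afr.version"        # stand-in, not the real name
SHADOW_VERSION_ATTR = "user.checksummer.version"   # invented shadow copies
SHADOW_MTIME_ATTR   = "user.checksummer.mtime"
CHECKSUM_ATTR       = "user.glusterfs.checksum"

def record_checksum(path, checksum):
    # Store the checksum plus shadow copies of the file's version
    # attribute and mtime taken at the same moment.
    os.setxattr(path, CHECKSUM_ATTR, checksum)
    try:
        os.setxattr(path, SHADOW_VERSION_ATTR, os.getxattr(path, VERSION_ATTR))
    except OSError:
        pass    # no version attribute on this file
    os.setxattr(path, SHADOW_MTIME_ATTR,
                str(os.stat(path).st_mtime).encode())

def checksum_is_stale(path):
    # True means the file changed legitimately since the checksum was
    # taken, so recompute it rather than treat a mismatch as corruption.
    try:
        if os.getxattr(path, SHADOW_VERSION_ATTR) != os.getxattr(path, VERSION_ATTR):
            return True
        old_mtime = float(os.getxattr(path, SHADOW_MTIME_ATTR).decode())
    except OSError:
        return True     # shadows missing: nothing to compare against
    return os.stat(path).st_mtime != old_mtime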
The one risk with this is that if a file gets corrupted
on both nodes, it will get deleted on both nodes, so you
will not even have a corrupted copy left to look at.
This too could be overcome by saving any deleted files
in a separate "trash can" and cleaning the trash can
once the files in it have been healed, sort of a
self-cleaning lost+found directory.
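A sketch of that trash-can variant; the directory name and layout are
invented for the example, and checksum_ok() is the same test used by
the scrubber above.

import os
import shutil
import time

TRASH_DIR = ".checksummer-trash"    # invented per-brick trash location

def move_to_trash(brick_root, path):
    # Park a corrupted file instead of unlinking it outright.
    rel = os.path.relpath(path, brick_root)
    dest = os.path.join(brick_root, TRASH_DIR, str(int(time.time())), rel)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.move(path, dest)

def empty_healed(brick_root, checksum_ok):
    # Drop trash entries whose originals exist again and pass their
    # checksum, i.e. AFR has healed them.
    trash_root = os.path.join(brick_root, TRASH_DIR)
    for dirpath, _dirs, files in os.walk(trash_root):
        for name in files:
            trashed = os.path.join(dirpath, name)
            # Strip the timestamp component added by move_to_trash() to
            # recover the original path relative to the brick root.
            parts = os.path.relpath(trashed, trash_root).split(os.sep, 1)
            if len(parts) < 2:
                continue    # not something move_to_trash() put here
            original = os.path.join(brick_root, parts[1])
            if os.path.exists(original) and checksum_ok(original):
                os.remove(trashed)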
I know this may not be the answer that you were
looking for, but I hope it helps clarify things
a little.
A while back I seem to remember someone talking about eventually
creating a fsck.glusterfs utility. Since underlying server node
corruption would (hopefully) not be a common problem, it seems like a
specific tool that could be run when prudent would be a good approach.
If the underlying data is suspected of corruption on a node, run the
normal fsck on that node, then run the fsck.glusterfs utility on the
share, which could utilize a much more comprehensive set of checks and
repairs than would be feasible in normal AFR file processing.
--
-Kevan Benson
-A-1 Networks