At 06:43 AM 1/8/2009, artur.k wrote:
>if the disk fails on a single server, will gluster remove files from
>the second (good) server?
>
>noc-xx-2:/mnt/glusterfs# echo 7 > 7
>bash: 7: Input/output error
>noc-xx-2:/mnt/glusterfs#
>noc-xx-2:/mnt/glusterfs# echo 7 > 7
>bash: 7: Input/output error
>noc-xx-2:/mnt/glusterfs# echo 7 > 7
>noc-xx-2:/mnt/glusterfs#
>noc-xx-2:/mnt/glusterfs#
>
>
>ehh... the 7 file exists on only one server. on the second server it
>was deleted by rm -f trac-xx-1:/var/storage/glusterfs/7

By deleting random files from the back-end filesystem, you are NOT
simulating a disk failure.

If there is a disk failure, you end up with a CLEAN/EMPTY filesystem
which has NO extended attributes. So when you start gluster, it will
see an empty filesystem with no attributes. It will then check with
the other AFR/HA server, see that that server has extended attributes
and is up to date, and conclude that that is the place to get the
latest files from. Then it will auto-magically start 'healing' the
entire disk. I've in fact had this exact situation happen to me in
late December, and I watched as the entire filesystem was recovered
from the other server.

By simply deleting parts of the filesystem, you are not affecting the
extended attributes, so once the filesystem is managed by gluster
again, gluster tries to put it back into the last known good state
from gluster's perspective.

So, again, if you're trying to simulate a failed drive, do this
(sketched in shell at the end of this message):

  - turn off gluster
  - delete the entire directory tree (including the last directory in
    the mountpoint); this clears out the extended attributes
  - make the directory again
  - remount the gluster mountpoint

For the examples you gave, here's what I'm understanding:

  - on the client you have the gluster filesystem mounted on
    /mnt/glusterfs
  - on the server you are serving files from /var/storage/glusterfs

Now, once /var/storage/glusterfs is in use by gluster, you SHOULD NOT
EVER modify this filesystem OUTSIDE of gluster. If you want to modify
this filesystem, you should mount it via gluster on the server and
then make changes that way.

So, if you want to remove the file /var/storage/glusterfs/xyz, you
should create a client volume on the server with the HA brick in it.
It doesn't necessarily have to have both servers; it can have just
itself as its own server. But (I think) it needs to have the ha/afr
brick configured, since that's the layer which maintains the extended
attributes. (There's a sketch of such a volume file at the end of
this message.) Then mount this somewhere on the server, say
/var/local/glusterfs, and then you can

  rm /var/local/glusterfs/xyz

You can verify that it was removed from /var/storage/glusterfs/, and
then you can restart the client and you'll see that it now auto-heals
as you expect. This is because the extended attributes on
/var/storage/glusterfs are updated and correct.

I suppose the best analogy to what you're simulating with your
examples would be to take a hard drive, scratch the platter, stick it
back in, and wonder why it's not giving you the right data. You're
effectively altering the "media" the filesystem runs on by altering
the underlying filesystem outside of gluster.

As for your input/output errors, again, make sure you're on a recent
build, since I think most of those are addressed.
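
Here is roughly what the failed-drive simulation looks like as shell
commands. This is only a sketch: the paths are the ones from your
example, and the init-script name (glusterfsd) and the client volume
file path are assumptions that will vary by distribution and gluster
version.

  ## on the server whose "disk" is failing:

  # 1. turn off gluster
  /etc/init.d/glusterfsd stop

  # 2. delete the entire export tree, INCLUDING the directory itself,
  #    which throws away the extended attributes along with the data
  rm -rf /var/storage/glusterfs

  # 3. recreate the directory empty, like a freshly formatted disk
  mkdir /var/storage/glusterfs

  # 4. bring the server back
  /etc/init.d/glusterfsd start

  ## on the client, remount so AFR can self-heal the whole tree:
  umount /mnt/glusterfs
  glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs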
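
For the server-local client volume, something like the volume file
below is what I mean. Sketch only: the volume names, the
remote-subvolume name ('brick'), and the transport option are
illustrative and must match your actual server volume file, and the
exact translator spelling (cluster/afr here) depends on your gluster
version.

  # /etc/glusterfs/local-client.vol  (sketch; names are illustrative)

  volume local-brick
    type protocol/client
    option transport-type tcp/client   # plain "tcp" on newer versions
    option remote-host 127.0.0.1       # this server itself
    option remote-subvolume brick      # must match the server's export
  end-volume

  # the ha/afr layer: this is what keeps the extended attributes
  # up to date, even with only a single subvolume under it
  volume afr
    type cluster/afr
    subvolumes local-brick
  end-volume

Then mount it and remove the file through it:

  glusterfs -f /etc/glusterfs/local-client.vol /var/local/glusterfs
  rm /var/local/glusterfs/xyz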
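
And if you want to see the extended attributes this all hinges on,
you can inspect the back-end directly from the server; reading them
is safe, it's only writes outside gluster that cause trouble. The
exact attribute names (trusted.afr.*, etc.) vary by version.

  # as root on the server; -m . shows all namespaces incl. trusted.*
  getfattr -d -m . -e hex /var/storage/glusterfs
  getfattr -d -m . -e hex /var/storage/glusterfs/xyz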