Adding the correct gluster-devel id.
Regards,
Raghavendra Bhat
On 07/08/2015 11:38 AM, Raghavendra Bhat wrote:
Hi,
In the bit-rot feature, the scrubber marks corrupted objects (objects
whose data has gone bad) as bad objects (via an extended attribute).
If the volume is a replicate volume and an object in one of the
replicas goes bad, the client can still see the data via the good copy
present in the other replica. But as of now, self-heal does not heal
bad objects. So the current method to heal a bad object is to remove
it directly from the backend and let self-heal take care of healing it
from the good copy.
The above method has a problem. The bit-rot-stub xlator sitting in the
brick graph remembers an object as bad in its inode context (either
when the object is marked bad by the scrubber, or during the first
lookup of the object if it was already marked bad). Bit-rot-stub uses
that info to block any read/write operations on such bad objects. So
it also blocks every operation attempted by self-heal to correct the
object: the object was deleted directly in the backend, but the
in-memory inode is still present and considered valid.
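To make the blocking concrete, here is a minimal sketch of what such a
check in the stub can look like. This is illustrative, not the actual
bit-rot-stub code: br_stub_is_bad() and BR_STUB_BAD_OBJECT are made-up
names; only inode_ctx_get(), STACK_WIND() and STACK_UNWIND_STRICT()
are the real libglusterfs interfaces.

#include "glusterfs.h"
#include "xlator.h"
#include "defaults.h"

#define BR_STUB_BAD_OBJECT 1ULL   /* hypothetical flag value */

static int
br_stub_is_bad (xlator_t *this, inode_t *inode)
{
        uint64_t ctx = 0;

        /* inode_ctx_get() succeeds only if this xlator stored a value
         * in the inode context earlier (at scrub time or during the
         * first lookup of an already-bad object). */
        if (inode_ctx_get (inode, this, &ctx) == 0 &&
            ctx == BR_STUB_BAD_OBJECT)
                return 1;

        return 0;
}

int
br_stub_readv (call_frame_t *frame, xlator_t *this, fd_t *fd,
               size_t size, off_t offset, uint32_t flags, dict_t *xdata)
{
        if (br_stub_is_bad (this, fd->inode)) {
                /* Known-corrupt object: fail the read with EIO. */
                STACK_UNWIND_STRICT (readv, frame, -1, EIO,
                                     NULL, 0, NULL, NULL, NULL);
                return 0;
        }

        STACK_WIND (frame, default_readv_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->readv,
                    fd, size, offset, flags, xdata);
        return 0;
}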
There are two methods that I think can solve the issue.
1) In server_lookup_cbk, if the lookup of an object fails with
ENOENT *AND* the lookup is a revalidate lookup, then forget the
inode associated with that object (not just unlinking the dentry,
but forgetting the inode as well, if and only if there are no more
dentries associated with the inode). At least this way the inode
would be forgotten, and later, when self-heal wants to correct the
object, it has to create a new object (the object was removed directly
from the backend). That creation happens with a new in-memory inode,
so read/write operations by the self-heal daemon will not be blocked.
I have sent a patch for review for the above method:
http://review.gluster.org/#/c/11489/
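The idea in rough form (an illustrative sketch; the real change is in
the patch above). The helper would be called from server_lookup_cbk()
when op_ret is -1, op_errno is ENOENT and the lookup is a revalidate
one, i.e. loc->inode was already resolved by an earlier lookup:

/* Drop the dentry of the now-missing object and, if nothing else
 * refers to the inode, forget the inode too, so a later heal starts
 * with a fresh in-memory inode. */
static void
forget_stale_inode (loc_t *loc)
{
        if (!loc->inode || !loc->parent || !loc->name)
                return;

        /* Remove the dentry pointing to the now-missing object. */
        inode_unlink (loc->inode, loc->parent, loc->name);

        /* Forget the inode itself only if no dentry (e.g. a hard
         * link) is left referring to it. */
        if (!inode_has_dentry (loc->inode))
                inode_forget (loc->inode, 0);
}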
OR
2) Do not block write operations on the bad object if the operation is
coming from self-heal; allow it to completely heal the file, and once
healing is done, remove the bad-object information from the inode
context.
The requests coming from the self-heal daemon can be identified by
checking their pid (it is negative). But if the self-heal happens from
the glusterfs client itself, I am not sure whether self-heal happens
with a negative pid for the frame, or with the same pid as the frame
of the original fop which triggered the self-heal. Pranith, can you
clarify this?
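In code, method 2 would look roughly like the following (again only a
sketch, reusing br_stub_is_bad() from the earlier sketch; whether a
negative pid also covers client-side self-heal is exactly the open
question above):

int
br_stub_writev (call_frame_t *frame, xlator_t *this, fd_t *fd,
                struct iovec *vector, int32_t count, off_t offset,
                uint32_t flags, struct iobref *iobref, dict_t *xdata)
{
        /* frame->root->pid is negative for self-heal daemon frames;
         * let those through so healing can rewrite the object. */
        if (br_stub_is_bad (this, fd->inode) && frame->root->pid >= 0) {
                STACK_UNWIND_STRICT (writev, frame, -1, EIO,
                                     NULL, NULL, NULL);
                return 0;
        }

        STACK_WIND (frame, default_writev_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->writev,
                    fd, vector, count, offset, flags, iobref, xdata);
        return 0;
}

Once the heal succeeds, the stub would also have to clear the
bad-object flag from the inode context (inode_ctx_del() or equivalent)
so that normal I/O on the object resumes.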
Please provide feedback.
Regards,
Raghavendra Bhat
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel