Adding the correct gluster-devel id.
Regards,
Raghavendra Bhat
On 07/08/2015 11:38 AM, Raghavendra Bhat wrote:
Hi,
In the bit-rot feature, the scrubber marks corrupted objects (objects
whose data has gone bad) as bad objects (via an extended attribute).
If the volume is a replicate volume and an object in one of the
replicas goes bad, the client can still see the data via the good copy
present in the other replica. But as of now, self-heal does not heal
bad objects. So the current method to heal a bad object is to remove
it directly from the backend and let self-heal take care of healing it
from the good copy.
The above method has a problem. The bit-rot-stub xlator sitting in the
brick graph remembers an object as bad in its inode context (either
when the object is marked bad by the scrubber, or during the first
lookup of the object if it was already marked bad). Bit-rot-stub uses
that info to block any read/write operations on such bad objects. So
it also blocks every operation attempted by self-heal to correct the
object: the object was deleted directly in the backend, but the
in-memory inode is still present and considered valid.
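To make the blocking concrete, here is a minimal sketch of what such a
check in the stub can look like. This is illustrative, not the actual
bit-rot-stub code: br_stub_is_bad() and BR_STUB_BAD_OBJECT are made-up
names; only inode_ctx_get(), STACK_WIND() and STACK_UNWIND_STRICT()
are the real libglusterfs interfaces.

#include "glusterfs.h"
#include "xlator.h"
#include "defaults.h"

#define BR_STUB_BAD_OBJECT 1ULL   /* hypothetical flag value */

static int
br_stub_is_bad (xlator_t *this, inode_t *inode)
{
        uint64_t ctx = 0;

        /* inode_ctx_get() succeeds only if this xlator stored a value
         * in the inode context earlier (at scrub time or during the
         * first lookup of an already-bad object). */
        if (inode_ctx_get (inode, this, &ctx) == 0 &&
            ctx == BR_STUB_BAD_OBJECT)
                return 1;

        return 0;
}

int
br_stub_readv (call_frame_t *frame, xlator_t *this, fd_t *fd,
               size_t size, off_t offset, uint32_t flags, dict_t *xdata)
{
        if (br_stub_is_bad (this, fd->inode)) {
                /* Known-corrupt object: fail the read with EIO. */
                STACK_UNWIND_STRICT (readv, frame, -1, EIO,
                                     NULL, 0, NULL, NULL, NULL);
                return 0;
        }

        STACK_WIND (frame, default_readv_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->readv,
                    fd, size, offset, flags, xdata);
        return 0;
}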
There are two methods that I think can solve the issue.
1) In server_lookup_cbk, if the lookup of an object fails with
ENOENT *AND* the lookup is a revalidate lookup, then forget the
inode associated with that object (not just unlinking the dentry,
but forgetting the inode as well, if and only if there are no more
dentries associated with the inode). At least this way the inode
would be forgotten, and later, when self-heal wants to correct the
object, it has to create a new object (the object was removed directly
from the backend). That creation happens with a new in-memory inode,
so read/write operations by the self-heal daemon will not be blocked.
I have sent a patch for review for the above method:
http://review.gluster.org/#/c/11489/
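The idea in rough form (an illustrative sketch; the real change is in
the patch above). The helper would be called from server_lookup_cbk()
when op_ret is -1, op_errno is ENOENT and the lookup is a revalidate
one, i.e. loc->inode was already resolved by an earlier lookup:

/* Drop the dentry of the now-missing object and, if nothing else
 * refers to the inode, forget the inode too, so a later heal starts
 * with a fresh in-memory inode. */
static void
forget_stale_inode (loc_t *loc)
{
        if (!loc->inode || !loc->parent || !loc->name)
                return;

        /* Remove the dentry pointing to the now-missing object. */
        inode_unlink (loc->inode, loc->parent, loc->name);

        /* Forget the inode itself only if no dentry (e.g. a hard
         * link) is left referring to it. */
        if (!inode_has_dentry (loc->inode))
                inode_forget (loc->inode, 0);
}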
OR
2) Do not block write operations on the bad object if the operation is
coming from self-heal; allow it to completely heal the file, and once
healing is done, remove the bad-object information from the inode
context.
The requests coming from the self-heal daemon can be identified by
checking their pid (it is negative). But if the self-heal happens from
the glusterfs client itself, I am not sure whether self-heal happens
with a negative pid for the frame, or with the same pid as the frame
of the original fop which triggered the self-heal. Pranith, can you
clarify this?
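In code, method 2 would look roughly like the following (again only a
sketch, reusing br_stub_is_bad() from the earlier sketch; whether a
negative pid also covers client-side self-heal is exactly the open
question above):

int
br_stub_writev (call_frame_t *frame, xlator_t *this, fd_t *fd,
                struct iovec *vector, int32_t count, off_t offset,
                uint32_t flags, struct iobref *iobref, dict_t *xdata)
{
        /* frame->root->pid is negative for self-heal daemon frames;
         * let those through so healing can rewrite the object. */
        if (br_stub_is_bad (this, fd->inode) && frame->root->pid >= 0) {
                STACK_UNWIND_STRICT (writev, frame, -1, EIO,
                                     NULL, NULL, NULL);
                return 0;
        }

        STACK_WIND (frame, default_writev_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->writev,
                    fd, vector, count, offset, flags, iobref, xdata);
        return 0;
}

Once the heal succeeds, the stub would also have to clear the
bad-object flag from the inode context (inode_ctx_del() or equivalent)
so that normal I/O on the object resumes.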
Please provide feedback.
Regards,
Raghavendra Bhat
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel