On Tue, 11 Sep 2012 22:10:01 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx> wrote:

> Hi Neil,
> I have been doing some investigation about resync on raid5, and I
> induced a repairable read error on a drive that contains a data block
> for a particular stripe_head. I saw that resync corrects the read
> error by recomputing the missing data block, rewriting it (twice) and
> then re-reading it back.
>
> Then I dug into the code and eventually saw that fetch_block():
>
>         if ((s->uptodate == disks - 1) &&
>             (s->failed && (disk_idx == s->failed_num[0] ||
>                            disk_idx == s->failed_num[1]))) {
>                 /* have disk failed, and we're requested to fetch it;
>                  * do compute it
>                  */
>
> doesn't care whether disk_idx holds data or parity for this
> stripe_head. It only checks that the stripe_head has enough
> redundancy to recompute the block.
>
> My question: is this behavior correct? I mean that if we are doing
> resync, we are not sure that the parity is correct (e.g., after an
> unclean shutdown). But we still use this possibly-incorrect parity
> to recompute the missing data block. So is this a potential bug?

Maybe.

I guess that if bad-block recording is enabled, then recording a bad
block there would be the "right" thing to do.

However, if bad-block recording is not enabled, then there are two
options:
 1/ kick the device from the array
 2/ re-write the failing block based on the parity block, which might
    be incorrect but very probably is correct.

I suspect that the second option (which is currently implemented) will
cause less data loss than the first. So I think the current code is
the best option (unless we want to add some bad-block-recording
support).

NeilBrown
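[Editor's note: the recovery step under discussion -- recomputing a
missing data block as the XOR of the surviving blocks in the stripe --
can be sketched in a few lines of standalone C. This is an
illustration only, not drivers/md/raid5.c; the geometry (NDISKS), the
toy block size (BLK), and the recompute() helper are all invented for
the example.]

	/*
	 * Sketch only: not the kernel's raid5 code. NDISKS, BLK and
	 * recompute() are made up for illustration.
	 */
	#include <stdio.h>
	#include <string.h>

	#define NDISKS 4	/* three data disks plus one parity disk */
	#define BLK    8	/* toy block size in bytes */

	/*
	 * Recompute the block of disk 'failed' as the XOR of every
	 * other block in the stripe. Note it trusts whatever parity it
	 * is given: if the parity is stale (e.g. after an unclean
	 * shutdown), the "recovered" data is silently wrong -- which is
	 * exactly the concern raised in the mail above.
	 */
	static void recompute(unsigned char stripe[NDISKS][BLK], int failed)
	{
		memset(stripe[failed], 0, BLK);
		for (int d = 0; d < NDISKS; d++) {
			if (d == failed)
				continue;
			for (int i = 0; i < BLK; i++)
				stripe[failed][i] ^= stripe[d][i];
		}
	}

	int main(void)
	{
		unsigned char stripe[NDISKS][BLK] = {
			"data-0", "data-1", "data-2"	/* disk 3 = parity */
		};

		/* Build correct parity on disk 3: XOR of the data blocks. */
		for (int d = 0; d < NDISKS - 1; d++)
			for (int i = 0; i < BLK; i++)
				stripe[NDISKS - 1][i] ^= stripe[d][i];

		recompute(stripe, 1);	/* pretend disk 1 failed to read */
		printf("recovered: %s\n", (char *)stripe[1]);	/* prints "data-1" */
		return 0;
	}

If the parity block above were stale, recompute() would still run to
completion and return plausible-looking but wrong data, which is the
trade-off weighed in the reply: rewriting from probably-correct parity
versus kicking the device or recording a bad block.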