Re: Question about raid5 disk recovery logic

Alexander Lyakas <alex.bolshoy@xxxxxxxxx> · Sun, 1 Jul 2012 16:36:51 +0300

Thanks, Neil!
That clarifies things.

Does this also mean that when md_do_sync() gets to such an
already-reconstructed stripe, it might reconstruct it once again,
unless the stripe stays in the stripe cache?

Thanks for helping,
Alex.

On Sun, Jul 1, 2012 at 11:00 AM, NeilBrown <neilb@xxxxxxx> wrote:
> On Sun, 1 Jul 2012 10:08:40 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> wrote:
>
>> Hi everybody,
>> I am trying to understand what happens when raid5 is recovering a
>> disk, and a write comes to a stripe that has not been recovered yet.
>> Does md first reconstruct the missing chunk and then apply the
>> write, or is the write applied first, as if the array were still
>> degraded (and not recovering), and the missing chunk reconstructed
>> only later (when the md_do_sync() loop gets to this area)?
>> I am looking at the stripe handling logic (kernel 2.6.38). Can anybody
>> please point me at the path that handle_stripe5() takes in that case?
>>
>>
>
> Hi Alex,
>
>  The stripe is still degraded, so md/raid5 treats it like a write to a
>  degraded array.
>  Exactly what happens depends on which block is being written.
>  If the block being written would be stored on the recovering device, then
>  md will perform a reconstruct-write.  It will read the other data blocks,
>  calculate the parity, and write out the parity and the changed data.
>  Similarly, if the parity block is on the recovering device, a
>  reconstruct-write will be needed.
>  If some other block is being written, md will do a read-modify-write to
>  calculate the new parity and then write out the parity and data.  In this
>  case the block on the recovering device will not be written.
>
>  I hope that clarifies the situation.
>
> NeilBrown
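
As an illustration of the two write paths described in the quoted reply,
here is a minimal Python sketch of a single-parity stripe. It is a toy
model, not the md/raid5 code: the function names (xor_blocks,
degraded_write), the in-memory stripe layout, and the returned path labels
are all assumptions made for illustration, and nothing here is meant to
mirror the internals of handle_stripe5().

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def degraded_write(stripe, parity_idx, recovering_idx, written_idx, new_data):
    """Toy model of a write to a stripe that has not been recovered yet.

    stripe is a list of per-device blocks (hypothetical layout). The entry
    for the recovering device is never read. Returns the chosen path and a
    dict mapping device index to the block that gets written.
    """
    assert written_idx != parity_idx, "callers write data blocks, not parity"
    updates = {written_idx: new_data}

    if recovering_idx in (written_idx, parity_idx):
        # Reconstruct-write: read the other data blocks and compute the
        # parity from scratch, since the old data or old parity is missing.
        others = [blk for i, blk in enumerate(stripe)
                  if i not in (written_idx, parity_idx)]
        updates[parity_idx] = xor_blocks(new_data, *others)
        return "reconstruct-write", updates

    # Read-modify-write: old parity and old data are both readable, so
    # new_parity = old_parity XOR old_data XOR new_data. The block on the
    # recovering device is neither read nor written.
    updates[parity_idx] = xor_blocks(stripe[parity_idx],
                                     stripe[written_idx], new_data)
    return "read-modify-write", updates


# Three data devices (0..2) plus parity on device 3; device 1 is recovering.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
stripe = data + [xor_blocks(*data)]
stripe[1] = None  # contents not available yet
path, updates = degraded_write(stripe, parity_idx=3, recovering_idx=1,
                               written_idx=0, new_data=b"\xaa\x55")
print(path, sorted(updates))  # prints: read-modify-write [0, 3]
```

In the read-modify-write case the parity stays consistent with the block
that is still missing, so XOR-ing the surviving data blocks with the new
parity yields that block again when it is eventually rebuilt.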