Re: about raid5 recovery when created

Michael Evans <mjevans1983@xxxxxxxxx> · Wed, 9 Dec 2009 15:29:33 -0800

On Wed, Dec 9, 2009 at 3:30 AM, hank peng <pengxihan@xxxxxxxxx> wrote:
> 2009/12/9 Michael Evans <mjevans1983@xxxxxxxxx>:
>> On Tue, Dec 8, 2009 at 6:03 AM, hank peng <pengxihan@xxxxxxxxx> wrote:
>>> 2009/12/8 Robin Hill <robin@xxxxxxxxxxxxxxx>:
>>>> On Tue Dec 08, 2009 at 09:49:48PM +0800, hank peng wrote:
>>>>
>>>>> 2009/12/8 Robin Hill <robin@xxxxxxxxxxxxxxx>:
>>>>> > On Tue Dec 08, 2009 at 09:01:23PM +0800, hank peng wrote:
>>>>> >
>>>>> >> Hi, all:
>>>>> >> As we know, when a raid5 array is created, recovery will be going on
>>>>> >> which involves some read, one xor and one write. Since there is no
>>>>> >> real data in the disk at the time, besides, if I am willing to wait
>>>>> >> for recovery to complete and then use this raid5, how about adding
>>>>> >> support for a fast recovery method? Right now, what is in my mind is
>>>>> >> zero all disks which belong to this raid5. I think it will increase
>>>>> >> raid5 recovery speed when created and decrease CPU usage, since all
>>>>> >> zero is also XORed.
>>>>> >> What do raid developers think?
>>>>> >>
>>>>> > It'll decrease CPU usage but increase I/O - you're now needing to write
>>>>> > to all disks.  Most systems will be I/O limited rather than CPU limited,
>>>>> > so the current approach works better.  If you want to zero the disks
>>>>> > then do this before creating the array - you can then use --assume-clean
>>>>> > to skip the resync process.
>>>>> >
>>>>> I think --assume-clean is used mostly when doing performance test and
>>>>> can't be used when creating a raid5 array using new disk, because
>>>>> later read and write operation make assumption that all stripe is
>>>>> XORed. Correct me if I am wrong.
>>>>>
>>>> You're correct - that's why I said to zero all the disks first so the
>>>> XOR data is all correct.
>>>>
>>> I think this function is better to be implemented in kernel raid
>>> layer, not in user space(for example using dd command).
>>> In this way, we can get good performance and lower cpu usage, also, we
>>> can make this function be part of raid code so that it can be managed
>>> by mdadm
>>>> Cheers,
>>>>    Robin
>>>> --
>>>>     ___
>>>>    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
>>>>   / / )      | Little Jim says ....                            |
>>>>  // !!       |      "He fallen in de water !!"                 |
>>>>
>>>
>>>
>>>
>>> --
>>> The simplest is not all best but the best is surely the simplest!
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>> How about documenting this better?  'zeroing all underlying devices
>> then creating with --assume-clean' will be clean because the parity
>> algorithm is even (or similar to 'even parity')?
> 'Lost in translation'
>> --

Maybe this will translate more easily.

The documentation should be more explicit.  "When every device the RAID
is made of has been filled with zeros before the array is created,
--assume-clean can be used, because the parity is already correct: the
XOR of all-zero data blocks is itself all zeros."
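
A minimal sketch in Python of why that holds, assuming plain byte-wise XOR
parity over one stripe.  The name xor_parity and the 16-byte chunk size are
made up for illustration; this is not md code.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length data blocks byte by byte (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


CHUNK = 16  # illustrative chunk size in bytes, not a real md default

# Three data chunks from freshly zeroed member devices.
zeroed = [bytes(CHUNK) for _ in range(3)]
# The parity they require is all zeros, which is exactly what a zeroed
# parity chunk already holds, so the stripe is consistent without a resync.
print(xor_parity(zeroed) == bytes(CHUNK))

# Three chunks holding leftover, non-zero data (0x11, 0x22, 0x44).
stale = [bytes([value] * CHUNK) for value in (0x11, 0x22, 0x44)]
# The required parity is 0x77 per byte, so whatever happens to be on the
# parity chunk is almost certainly wrong and --assume-clean would be unsafe.
print(xor_parity(stale).hex())
```

The first line printed is True; the second shows the non-zero parity that
leftover data would need.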