RE: AWFUL reshape speed with raid5.

>-----Original Message-----
>From: linux-raid-owner@xxxxxxxxxxxxxxx
>[mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Justin Piszcz
>Sent: Monday, July 28, 2008 2:44 PM
>To: Jon Nelson
>Cc: LinuxRaid
>Subject: Re: AWFUL reshape speed with raid5.
>
>There was a bug in an earlier kernel where, with certain chunk sizes, the
>rebuild ran at min_speed. Have you tried echoing 30000 into min_speed?
>Does that raise the rebuild to 30 MB/s?
>
>On Mon, 28 Jul 2008, Jon Nelson wrote:
>
>> Some more data points, observations, and questions.
>>
>> For each test, I'd --create the array, drop the caches, --grow, and
>> then watch vmstat and also record the time between
>>
>> kernel: md: resuming resync of md99 from checkpoint.
>> and
>> kernel: md: md99: resync done.
>>
>> I found two things:
>>
>> 1. metadata version matters. Why?
>> 2. VERY LITTLE I/O takes place (between 0 and 100KB/s, typically no
>> I/O at all) according to vmstat. Why? If it takes 1m34s to "grow" the
>> array, but no I/O is taking place, then what is actually taking so
>> long?
>> 3. I removed the bitmap for these tests. Having a bitmap meant that
>> the overall speed was REALLY HORRIBLE.
>>
>> The results:
>>
>> metadata: time taken
>>
>> 0.9: 27s
>> 1.0: 27s
>> 1.1: 37s
>> 1.2: 1m34s
>>
>> Questions (repeated):
>>
>> 1. Why does the metadata version matter so much?
>> 2. If no I/O is taking place, why does it take so long? [ NOTE: I/O
>> must be taking place but why doesn't vmstat show it? ]
>>
>> -- 
>> Jon
>>
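
[The min_speed knob referred to above is the md resync speed floor. A
minimal sketch of the check Justin suggests, assuming a 2.6-era kernel
with /proc/sys/dev/raid present; the md device name md99 is taken from
the thread and should be adjusted to yours:]

```shell
# Raise the resync speed floor to ~30 MB/s (value is in KB/s per device)
echo 30000 > /proc/sys/dev/raid/speed_limit_min

# Verify the new floor took effect
cat /proc/sys/dev/raid/speed_limit_min

# Watch whether the reshape/resync speed reported by md actually rises
grep -A 2 md99 /proc/mdstat
```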

You are working from the incorrect premise that vmstat measures disk
activity.  It does not.  Vmstat has no idea how many bytes are actually
sent to, or received from, the disk drives.

Why not do a real test: hook up a pair of SAS, SCSI, or FC disks, then
issue LOG SENSE commands to report the actual number of bytes read from
and written to each disk during the rebuild?  If the disks are
FibreChannel, you have even more ways to measure true throughput in
bytes.  It will not be an estimate; it will be a real count of
cumulative bytes read, written, re-read/re-written, recovered, and so
on, at any instant in time.  Heck, with Seagate and some other disks you
can even see detailed information on cached reads, so you can check
whether a particular md configuration yields a higher number of cached
I/Os, meaning greater efficiency and lower overall latency.
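
[The LOG SENSE pages described above can be pulled from user space with
the sg_logs utility from the sg3_utils package. A sketch, assuming the
disk appears as /dev/sda (adjust to your device) and that the drive
implements the standard counter pages; exactly which counters a drive
exposes is vendor-dependent:]

```shell
# Dump every log page the drive supports via LOG SENSE
sg_logs --all /dev/sda

# Write and read error counter pages (SPC pages 0x02 and 0x03),
# sampled before and after the rebuild to see what actually moved
sg_logs --page=0x02 /dev/sda
sg_logs --page=0x03 /dev/sda
```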


David @ santools dot com



--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
