Re: RAID10 Write Performance

On Thu, Dec 24 2015, Marc Smith wrote:

> Okay, thanks, I'll turn it back on and try some different chunk sizes.
>
> For my own knowledge, why/what is taking place under the covers that
> causes this behavior? When testing with fio, it sometimes takes 1-2
> minutes of "ramp up time" before the performance numbers are
> good/expected (when the write-intent bitmap is enabled).
>

Whenever md needs to write to a region (a bitmap-chunk) of the array
that it hasn't written to recently, it needs to set a bit and write out
the bitmap first.  It tries to gather multiple writes together and set
several bits at once, but a synchronous workload will defeat that.
Once the bit is set it will stay set until several seconds after the
last write.  I think it defaults to 5 seconds.
  mdadm -X /dev/some-component
will list it as 'daemon sleep'.

So the delay you are seeing is the time it takes to get all of those
bits set.  One minute does sound like a long time, though if the writes
are synchronous that would easily explain it.
With a larger bitmap chunk size there are fewer bits to set, so fewer
times that the drives need to seek to the other end of the disk to
write out the bitmap.
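The chunk-size/bit-count relationship is just division over the array size; a rough sketch with made-up numbers (a hypothetical 4 TiB array, not your actual geometry):

```python
import math

def bitmap_bits(array_bytes, chunk_bytes):
    """How many write-intent bitmap bits are needed to cover the array."""
    return math.ceil(array_bytes / chunk_bytes)

array = 4 * 1024**4  # hypothetical 4 TiB array
for chunk_mib in (64, 256, 1024):
    bits = bitmap_bits(array, chunk_mib * 1024**2)
    print(f"{chunk_mib:5d} MiB chunk -> {bits} bits")
```

Going from a 64 MiB to a 1 GiB chunk cuts the bit count (and hence the worst-case number of bitmap updates) by a factor of 16, at the cost of more data to resync after a crash.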

If you run "watch -n 0.1 mdadm -X /dev/something" in a window it will
report how many bits are set moment by moment.  That might give you some
feel for what is happening.
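Concretely, something like the following (device names are placeholders for your own components and array, and the chunk size is just an example value to benchmark):

```shell
# Poll the bitmap and show only the dirty-bit count; the exact wording
# of the "Bitmap: ... dirty" line varies with the mdadm version.
watch -n 0.1 'mdadm -X /dev/sda1 | grep -i dirty'

# To try a larger bitmap chunk, remove the bitmap and re-create it
# with a bigger chunk size, then re-run the benchmark:
mdadm --grow --bitmap=none /dev/md0
mdadm --grow --bitmap=internal --bitmap-chunk=128M /dev/md0
```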

NeilBrown



>
> Thanks,
>
> Marc
>
>
> On Tue, Dec 22, 2015 at 9:20 PM, NeilBrown <neilb@xxxxxxxx> wrote:
>> On Wed, Dec 23 2015, Marc Smith wrote:
>>
>>> Solved... appears it was the write-intent bitmap that caused the
>>> performance issues. I discovered if I left the test running longer
>>> than 60 seconds, the performance would eventually climb to where I'd
>>> expect it. I ran 'mdadm --grow --bitmap=none /dev/md0' and now random
>>> write performance is high/good/stable right off the bat.
>>
>> Keeping a write-intent bitmap really is a good idea.
>> Using a larger bitmap chunk size can reduce the performance penalty and
>> preserve much of the value.  It is easy enough to experiment with
>> different sizes.
>>
>> NeilBrown


