Re: md RAID with enterprise-class SATA or SAS drives

Hmm, nice.
Are raid10 and raid0 single-threaded too?

2012/5/21 Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>:
> On 5/21/2012 1:54 PM, Roberto Spadim wrote:
>> hmm, could anyone explain how a 'multi thread' version of raid1
>> could be implemented?
>> for example, how would it scale? and why would this new implementation
>> scale better?
>
> I just did, below.  You layer a stripe over many RAID 1 pairs.  A single
> md RAID 1 pair isn't enough to saturate a single core, so there is no
> gain to be had by trying to thread the RAID 1 code.
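>
> A minimal sketch of that layering with mdadm (the device names
> /dev/sd[b-e] and the md numbers are hypothetical; commands untested
> here):
>
>   # two md RAID 1 pairs
>   mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
>   mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde
>   # RAID 0 stripe layered over the two pairs
>   mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2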
>
> --
> Stan
>
>
>> 2012/5/21 Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>:
>>> On 5/21/2012 10:20 AM, CoolCold wrote:
>>>> On Sat, May 12, 2012 at 2:28 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
>>>>> On 5/11/2012 3:16 AM, Daniel Pocock wrote:
>>>>>
>>>> [snip]
>>>>> That's the one scenario where I abhor using md raid, as I mentioned;
>>>>> at least, for a boot raid 1 pair.  Using layered md raid 1 + 0, or
>>>>> 1 + linear, is a great solution for many workloads.  Ask me why I say
>>>>> raid 1 + 0 instead of raid 10.
>>>> So, I'm asking - why?
>>>
>>> Neil pointed out quite some time ago that the md RAID 1/5/6/10 code runs
>>> as a single kernel thread.  Thus, when running heavy IO workloads across
>>> many spinning rust disks or a few SSDs, the md thread becomes CPU bound,
>>> as it can only execute on a single core, just as with any other single
>>> thread.
>>>
>>> This issue is becoming more relevant as folks move to the latest
>>> generation of server CPUs that trade clock speed for higher core count.
>>>  Imagine the surprise of the OP who buys a dual socket box with 2x 16
>>> core AMD Interlagos 2.0GHz CPUs, 256GB RAM, and 32 SSDs in md RAID 10,
>>> only to find he gets a tiny fraction of the SSD throughput.  Upon
>>> investigation he finds a single md thread pegging one core while the
>>> rest are relatively idle but for the application itself.
>>>
>>> As I understand Neil's explanation, the md RAID 0 and linear code don't
>>> run as separate kernel threads, but merely pass offsets to the block
>>> layer, which is fully threaded.  Thus, by layering md RAID 0 over md
>>> RAID 1 pairs, the striping load is spread over all cores.  Same with
>>> linear, avoiding the single thread bottleneck.
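>>>
>>> A quick way to see this on a live box (array names hypothetical): each
>>> RAID 1/5/6/10 array shows a kernel thread such as md1_raid1 in ps
>>> output, while a RAID 0 or linear array has none:
>>>
>>>   ps -e | grep raid   # md1_raid1, md2_raid1 ... nothing for RAID 0
>>>   top                 # press '1'; under load one core pegs on that thread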
>>>
>>> This layering can be done with any md RAID level, creating RAID50s and
>>> RAID60s, or concatenations of RAID5/6, as well as of RAID 10.
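>>>
>>> For example, a hypothetical RAID 50 built the same way (made-up device
>>> names, untested):
>>>
>>>   mdadm --create /dev/md3 --level=5 --raid-devices=3 /dev/sd[b-d]
>>>   mdadm --create /dev/md4 --level=5 --raid-devices=3 /dev/sd[e-g]
>>>   mdadm --create /dev/md5 --level=0 --raid-devices=2 /dev/md3 /dev/md4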
>>>
>>> And it shouldn't take anywhere near 32 modern SSDs to saturate a single
>>> 2GHz core with md RAID 10.  It's likely fewer than 8 SSDs, which yield
>>> ~400K IOPS, but I haven't done verification testing myself at this point.
>>>
>>> --
>>> Stan
>>
>>
>>
>



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

