Re: single cpu thread performance limit?

Hi... sorry for the lack of initial info; your question made me
realise how much I had missed out! Hopefully this adds some colour.

PCIe-based flash - SLC-based
Multiple Xeon 5640s (16 cores total)
MSI interrupts all set (affinity / pinning also tried)
SLES 11 (kernel 2.6.32.43-0.5)
Tried on both a Supermicro and a Dell R-series server

The thread is md0_raid10 (or something similar; I am not near the box now).
This thread is easily linked to the MD device(s):
create 4 x RAID1s and you get 4 x MD threads, etc.

So, a single RAID10 creates a single thread, which maxes out at maybe 200K IOPS.
Creating 4 x RAID10s seems OK, but they do not scale so well with a
RAID0 on top :(
Ideal would be a few threads per RAIDx.
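For reference, the 4 x RAID1 + RAID0 layout I described is roughly the
following (the /dev/fio* device names are placeholders for the actual
PCIe flash devices, not the real names on the rig):

```shell
# Build 4 mirrored pairs from the 8 flash devices (device names are placeholders)
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/fioa /dev/fiob
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/fioc /dev/fiod
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/fioe /dev/fiof
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/fiog /dev/fioh

# Stripe across the mirrors - RAID10-equivalent, but now with 4 MD threads
mdadm --create /dev/md4 --level=0 --raid-devices=4 \
    /dev/md0 /dev/md1 /dev/md2 /dev/md3

# Each RAID1 gets its own kernel thread; you can watch them hit 100% with:
ps ax | grep 'md._raid'
top -H
```

This is a sketch of the configuration, not a cut-and-paste recipe; the
point is that each mdadm --create of a RAID1 spawns one mdX_raidY
kernel thread, and those are what saturate a core each.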


Using basic fio for IOPS (4 workers, QD 128); this uses hardly any
CPU resource.
Reads are maybe 50% faster, as you would expect.
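The fio load is roughly the following (the target device name is a
placeholder; exact options on the rig may differ slightly):

```shell
# 4 workers at queue depth 128, 4K random writes, raw block device
# (/dev/md0 is a placeholder for the actual MD device under test)
fio --name=randwrite --filename=/dev/md0 \
    --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k \
    --iodepth=128 --numjobs=4 \
    --runtime=60 --time_based --group_reporting
```

Swap --rw=randwrite for --rw=randread to get the read numbers.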

The issue seems to be that a single thread will only deliver X IOPS
before hitting 100% CPU... with emerging flash, this no longer reaches
the devices' capability.

FS: a filesystem is not really an option for this solution, so we have
not tried it on this rig, but in the past a filesystem has degraded the IOPS.

Whilst a RAID0 on top of the RAID1/10s does offer some increase in
performance, linear does not :(
LVM RAID0 on top of the MD RAID1/10s gives much the same results.
The limiter seems fixed at the single thread per RAID1/10.
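For completeness, the LVM striping we tried looks roughly like this
(the volume group and LV names are placeholders, as is the stripe size):

```shell
# Stripe an LVM logical volume across the 4 MD mirrors
# (VG/LV names and stripe size are placeholders)
pvcreate /dev/md0 /dev/md1 /dev/md2 /dev/md3
vgcreate fastvg /dev/md0 /dev/md1 /dev/md2 /dev/md3
lvcreate --stripes 4 --stripesize 64k --extents 100%FREE \
    --name fastlv fastvg
```

Whether the stripe layer is MD RAID0 or an LVM striped LV, the
bottleneck stays in the per-RAID1 kernel threads underneath.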


Thank you for any feedback!

Mark



On Thu, Aug 11, 2011 at 7:58 PM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
> On 8/11/2011 10:58 AM, mark delfman wrote:
>> I seem to have hit a significant hard stop in MD RAID1/10 performance
>> which seems to be linked to a single CPU thread.
>
> What is the name of the kernel thread that is peaking your cores?  Could
> the device driver be eating the CPU and not the md kernel threads?  Is
> it both?  Is it a different thread?  How much CPU is the IO generator
> app eating?
>
> What Linux kernel version are you running?  Which Linux distribution?
> What application are you using to generate the IO load?  Does it work at
> the raw device/partition level or at the file level?
>
>> I am using extremely high speed (IOPS) internal block devices – 8 in
>> total.  They are capable of achieving > 1million iops.
>
> 8 solid state drives of one model or another, probably occupying 8 PCIe
> slots.  IBIS, VeloDrive, the LSI SSD, or other PCIe based SSD?  Or are
> these plain SATA II SSDs that *claim* to have 125K 4KB random IOPS
> performance?
>
>> However if I use RAID1 / 10 then MD seems to use a single thread which
>> will reach 100% CPU utilisation (single core) at around 200K IOPS.
>> Limiting the entire performance to around 200K.
>
> CPU frequency?  How many sockets?  Total cores?  Whose box?  HP, Dell,
> IBM, whitebox, self built?  If the latter two, whose motherboard?  How
> many PCIe slots are occupied by the SSD cards?
>
>> If I use say 4 x RAID1 / 10’s and a RAID0 on top – I see not much
>> greater results. (although the theory seems to say I should and there
>> are now 4 CPU threads running, it still seems to hit 4 x 100% at maybe
>> 350K).
>
> Assuming you have 4 processors (cores), then yes, you should see better
> scaling.  If you have less cores than threads, then no.  Do you see more
> IOPS before running out of CPU when writing vs reading?  You should as
> you're doing half the IOs when reading.
>
>> Is there any way to increase the number of threads per RAID set? Or
>> any other suggestions on configurations?  (I have tried every
>> permutation of R0+R1/10’s)
>
> The answer to the first question AFAIK is no.  Do you have the same
> problem with a single --linear array?  What is the result when putting a
> filesystem on each individual drive?  Do you get your 1 million IOPS?
>
> Is MSI enabled and verified to be working for each PCIe SSD device?  See:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/PCI/MSI-HOWTO.txt;hb=HEAD
>
> --
> Stan
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html