Hi... sorry for the lack of initial info - your questions made me
realise how much I had missed off! Hopefully this adds some color:

PCIe-based flash, SLC based
Multiple Xeon 5640s (16 cores total)
MSI interrupts all set (and IRQ affinity / pinning tried)
SLES 11 (kernel 2.6.32.43-0.5)
Tried on both a Supermicro and a Dell R-series server

The thread is md0_raid10 (or something similar - I am not near the box
right now). The thread is easily linked to the MD device(s): create
4 x RAID1s and you get 4 x MD threads, etc.

So a single RAID10 creates a single thread, which maxes out at maybe
200K IOPS. Creating 4 x RAID10s seems OK, but they do not scale so
well with a RAID0 on top :( Ideal would be a few threads per RAIDx.

Using basic fio for IOPS (4 workers, queue depth 128) - fio itself uses
hardly any CPU. Reads are maybe 50% faster than writes, as you would
expect. The issue seems to be that a single MD thread will only deliver
so much before hitting 100% of one core... with emerging flash, that no
longer reaches the capability of the devices.

FS: a filesystem is not really an option for this solution, so we have
not tried one on this rig, but in the past a filesystem has degraded
the IOPS.

Whilst a RAID0 on top of the RAID1/10s does offer some increase in
performance, it is nowhere near linear :( LVM striping on top of the
MD RAID1/10s gives much the same result. The limiter seems fixed at the
single thread per RAID1/10. (I have put a rough sketch of the fio job
and the layered layout below the quoted mail.)

Thank you for any feedback!

Mark

On Thu, Aug 11, 2011 at 7:58 PM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
> On 8/11/2011 10:58 AM, mark delfman wrote:
>> I seem to have hit a significant hard stop in MD RAID1/10 performance
>> which seems to be linked to a single CPU thread.
>
> What is the name of the kernel thread that is peaking your cores? Could
> the device driver be eating the CPU and not the md kernel threads? Is
> it both? Is it a different thread? How much CPU is the IO generator
> app eating?
>
> What Linux kernel version are you running? Which Linux distribution?
> What application are you using to generate the IO load? Does it work at
> the raw device/partition level or at the file level?
>
>> I am using extremely high speed (IOPS) internal block devices – 8 in
>> total. They are capable of achieving > 1million iops.
>
> 8 solid state drives of one model or another, probably occupying 8 PCIe
> slots. IBIS, VeloDrive, the LSI SSD, or other PCIe based SSD? Or are
> these plain SATA II SSDs that *claim* to have 125K 4KB random IOPS
> performance?
>
>> However if I use RAID1 / 10 then MD seems to use a single thread which
>> will reach 100% CPU utilisation (single core) at around 200K IOPS.
>> Limiting the entire performance to around 200K.
>
> CPU frequency? How many sockets? Total cores? Whose box? HP, Dell,
> IBM, whitebox, self built? If the latter two, whose motherboard? How
> many PCIe slots are occupied by the SSD cards?
>
>> If I use say 4 x RAID1 / 10’s and a RAID0 on top – I see not much
>> greater results. (although the theory seems to say I should and there
>> are now 4 CPU threads running, it still seems to hit 4 x 100% at maybe
>> 350K).
>
> Assuming you have 4 processors (cores), then yes, you should see better
> scaling. If you have less cores than threads, then no. Do you see more
> IOPS before running out of CPU when writing vs reading? You should as
> you're doing half the IOs when reading.
>
>> Is there any way to increase the number of threads per RAID set? Or
>> any other suggestions on configurations? (I have tried every
>> permutation of R0+R1/10’s)
>
> The answer to the first question AFAIK is no. Do you have the same
> problem with a single --linear array?
> What is the result when putting a
> filesystem on each individual drive? Do you get your 1 million IOPS?
>
> Is MSI enabled and verified to be working for each PCIe SSD device? See:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/PCI/MSI-HOWTO.txt;hb=HEAD
>
> --
> Stan
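
In case it helps, this is roughly the fio job behind the IOPS numbers
above - a sketch only, the device name is a placeholder (ours are the
PCIe flash block devices / MDs under test) and the exact runtime and
options may differ a little from what is on the box:

  # 4KB random writes, 4 workers x QD 128, straight at the block device
  # (/dev/md0 is a placeholder for whichever device is under test)
  fio --name=randwrite --filename=/dev/md0 --direct=1 --ioengine=libaio \
      --rw=randwrite --bs=4k --numjobs=4 --iodepth=128 \
      --runtime=60 --time_based --group_reporting

And roughly the layered layout we are testing; /dev/flash[a-h] are
placeholders for the 8 PCIe flash devices, and the chunk size is just
an example:

  # four RAID1 pairs - one mdX_raid1 write thread each
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/flasha /dev/flashb
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/flashc /dev/flashd
  mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/flashe /dev/flashf
  mdadm --create /dev/md4 --level=1 --raid-devices=2 /dev/flashg /dev/flashh

  # RAID0 stripe across the four mirrors (LVM striping over the same
  # four MDs gives much the same result for us)
  mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 \
      /dev/md1 /dev/md2 /dev/md3 /dev/md4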