Just to understand: I haven't thought about an implementation yet. What
could be done to "multi-thread" md RAID 1, 10, 5 and 6? I don't understand
why it is a problem; I think the only CPU time it needs is the time to work
out which disk and which position must be read for each I/O request. I'm
only thinking about normal reads and writes, without resync, check,
bad-read/write handling, or any other management feature running. (A rough
sketch of the per-request arithmetic I have in mind is at the end of this
mail, after the quote.)

2012/5/23 Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>:
> On 5/22/2012 2:29 AM, David Brown wrote:
>
>> But in general, it's important to do some real-world testing to
>> establish whether or not there really is a bottleneck here. It is
>> counter-productive for Stan (or anyone else) to advise against raid10 or
>> raid5/6 because of a single-thread bottleneck if it doesn't actually
>> slow things down in practice.
>
> Please reread precisely what I stated earlier:
>
> "Neil pointed out quite some time ago that the md RAID 1/5/6/10 code
> runs as a single kernel thread. Thus when running heavy IO workloads
> across many rust disks or a few SSDs, the md thread becomes CPU bound,
> as it can only execute on a single core, just as with any other single
> thread."
>
> Note "heavy IO workloads". The real-world testing upon which I based my
> recommendation is in this previous thread on linux-raid, of which I was
> a participant.
>
> Mark Delfman did the testing which revealed this md RAID thread
> scalability problem using 4 PCIe enterprise SSDs:
>
> http://marc.info/?l=linux-raid&m=131307849530290&w=2
>
>> On the other hand, if it /is/ a hindrance to
>> scaling, then it is important for Neil and other experts to think about
>> how to change the architecture of md raid to scale better. And
>
> More thorough testing and identification of the problem is definitely
> required. Apparently few people are currently running md RAID 1/5/6/10
> across multiple ultra-high-performance SSDs, people who actually need
> every single ounce of IOPS out of each device in the array. But this
> trend will increase. I'd guess those currently building md 1/5/6/10
> arrays with many SSDs simply don't *need* every ounce of IOPS, or more
> would be complaining about the single-core thread limit already.
>
>> somewhere in between there can be guidelines to help users - something
>> like "for an average server, single-threading will saturate raid5
>> performance at 8 disks, raid6 performance at 6 disks, and raid10 at 10
>> disks, beyond which you should use raid0 or linear striping over two or
>> more arrays".
>
> This isn't feasible due to the myriad possible combinations of hardware.
> And you simply won't see this problem with SRDs (spinning rust disks)
> until you have hundreds of them in a single array. It requires over 200
> 15K SRDs in RAID 10 to generate only 30K random IOPS. Just about any
> single x86 core can handle that, probably even a 1.6GHz Atom. This
> issue mainly affects SSD arrays, where even 8 midrange consumer SATA3
> SSDs in RAID 10 can generate over 400K IOPS: 200K real and 200K mirror data.
>
>> Of course, to do such testing, someone would need a big machine with
>> lots of disks, which is not otherwise in use!
>
> Shouldn't require anything that heavy. I would guess that one should be
> able to reveal the thread bottleneck with a low-frequency dual-core
> desktop system with an HBA such as the LSI 9211-8i @ 320K IOPS, and
> 8 SandForce 2200-based SSDs @ 40K write IOPS each.
>
> --
> Stan
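
To illustrate what I mean by "the time to tell what disk and what
position": below is a small user-space sketch in C of the per-request
mapping arithmetic for a simplified RAID10 "near-2"-style layout. The
struct and function names are made up for this example, and it is not the
actual md code (the real raid10 driver also handles bios, locking, bitmap
updates, degraded arrays and so on); it only shows that the mapping itself
is a handful of divisions and multiplications per request.

/*
 * Simplified model of a RAID10 near-2 mapping: turn a logical sector
 * into (member device, sector on that device) for one of the copies.
 * Illustrative only; names and layout details are assumptions, not
 * the real md implementation.
 */
#include <stdint.h>
#include <stdio.h>

struct r10_layout {
	unsigned int nr_disks;      /* total member devices           */
	unsigned int near_copies;   /* 2 for the usual near-2 layout  */
	uint64_t chunk_sectors;     /* chunk size in 512-byte sectors */
};

static void map_sector(const struct r10_layout *l, uint64_t logical,
		       unsigned int copy, unsigned int *dev,
		       uint64_t *dev_sector)
{
	uint64_t chunk    = logical / l->chunk_sectors; /* which data chunk */
	uint64_t offset   = logical % l->chunk_sectors; /* offset inside it */
	unsigned int cols = l->nr_disks / l->near_copies;
	uint64_t row      = chunk / cols;               /* stripe (row)     */
	unsigned int col  = chunk % cols;               /* mirror set (col) */

	*dev        = col * l->near_copies + copy;      /* pick one mirror  */
	*dev_sector = row * l->chunk_sectors + offset;
}

int main(void)
{
	struct r10_layout l = { .nr_disks = 4, .near_copies = 2,
				.chunk_sectors = 1024 };  /* 512 KiB chunks */
	unsigned int dev;
	uint64_t dev_sector;

	/* Where would copy 0 of logical sector 123456789 live? */
	map_sector(&l, 123456789ULL, 0, &dev, &dev_sector);
	printf("disk %u, sector %llu\n", dev, (unsigned long long)dev_sector);
	return 0;
}

Compile with something like "gcc -Wall sketch.c"; it prints the member disk
and offset for one example sector. Read balancing would just pick copy 0 or
1 per request on top of this.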
--
Roberto Spadim
Spadim Technology / SPAEmpresarial