On Fri, Nov 22, 2013 at 9:04 PM, NeilBrown <neilb@xxxxxxx> wrote:
> I guess with that many drives you could hit PCI bus throughput limits.
>
> A 16-lane PCIe 4.0 could just about give 100MB/s to each of 16 devices. So
> you would really need top-end hardware to keep all of 16 drives busy in a
> recovery.
> So yes: rebuilding a drive in a 16-drive RAID6+ would be slower than in e.g.
> a 20 drive RAID10.

Not really. A single 8x PCIe 2.0 card has 8 x 500MB/s = 4000MB/s of
potential bandwidth, which works out to 250MB/s per drive for 16 drives.
And quite a few people running software RAID across many drives use
multiple PCIe cards. For example, in one machine I have three IBM M1015
cards (which I got for $75/ea), each of which is 8x PCIe 2.0. That comes
to 3 x 8 x 500MB/s = 12GB/s of IO bandwidth.

Also, your math is wrong. PCIe 3.0 is about 985MB/s per lane, and if we
assume PCIe 4.0 doubles that, we get 1970MB/s per lane. So a single lane
of the hypothetical PCIe 4.0 would have enough IO bandwidth to give about
120MB/s to each of 16 drives, and an 8x PCIe 4.0 card would have 8 times
that capability, which is more than 15GB/s. Even a single 8x PCIe 3.0
card has potentially over 7GB/s of bandwidth.

The bottom line is that IO bandwidth is not a problem for a system with
prudently chosen hardware. It is more likely that you would be CPU
limited (rather than bus limited) in a high-parity rebuild where more
than one drive has failed. But even that is unlikely to be too bad, since
Andrea's single-threaded recovery code can recover two drives at nearly
1GB/s on one of my machines, and I think the code could probably be
threaded to achieve a multiple of that across multiple cores.
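For anyone who wants to redo the arithmetic for other card and drive
counts, here is a minimal Python sketch using the per-lane figures above
(the PCIe 4.0 entry is just the assumed doubling of 3.0, not a published
spec number):

# Back-of-the-envelope per-drive bandwidth for one HBA shared evenly
# across N drives. Per-lane rates are the approximate usable figures
# quoted above: PCIe 2.0 ~500MB/s, 3.0 ~985MB/s, and a hypothetical
# 4.0 at double the 3.0 rate.
PER_LANE_MBPS = {"2.0": 500, "3.0": 985, "4.0": 1970}

def per_drive_mbps(gen, lanes, drives):
    """MB/s available to each drive if the link is shared evenly."""
    return PER_LANE_MBPS[gen] * lanes / drives

print(per_drive_mbps("2.0", 8, 16))  # one 8x PCIe 2.0 card, 16 drives -> 250.0
print(per_drive_mbps("4.0", 1, 16))  # one PCIe 4.0 lane, 16 drives -> ~123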