> 8 SCSI U320 (15000 rpm) disks where 4 disks (sdc, sdd, sde, sdf)

figure each is worth, say, 60 MB/s, so you'll peak (theoretically) at
240 MB/s per channel.

> The U320 SCSI controller has a 64 bit PCI-X bus for itself, there is no other
> device on that bus. Unfortunately I was unable to determine at what speed
> it is running, here is the output from lspci -vv:
...
> Status: Bus=2 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple,

the "133MHz+" is a good sign.  OTOH the latency (72) seems rather low -
my understanding is that this would noticeably limit the size of burst
transfers.

> Anyway, I thought with this system I would get theoretically 640 MB/s using
> both channels.

"theoretically" in the same sense as "according to quantum theory,
Bush and Bin Laden may swap bodies tomorrow morning at 4:59."

> write speeds for this system. But testing shows that the absolute maximum I
> can reach with software raid is only approx. 270 MB/s for writing. Which is
> very disappointing.

it's a bit low, but "very" is unrealistic...

> deadline and distribution is fedora core 4 x86_64 with all updates. Chunksize
> is always the default from mdadm (64k). Filesystem was always created with the
> command mke2fs -j -b4096 -O dir_index /dev/mdx.

bear in mind that a 64k chunksize means that an 8-disk raid5 will really
only work well for writes that are multiples of 7*64=448K...

> I also have tried with 2.6.13-rc7, but here the speed was much lower, the
> maximum there was approx. 140 MB/s for writing.

hmm, there should not have been any such dramatic slowdown.

> Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
>                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> Raid0 (8 disk)15744M 54406  96 247419  90 100752  25 60266  98 226651  29 830.2   1
> Raid0s(4 disk)15744M 54915  97 253642  89  73976  18 59445  97 198372  24 659.8   1
> Raid0s(4 disk)15744M 54866  97 268361  95  72852  17 59165  97 187183  22 666.3   1

you're obviously saturating something already with 2 disks.  did you play
with "blockdev --setra" settings?  (some example commands at the end of
this mail.)

> Raid5 (8 disk)15744M 55881  98 153735  51  61680  24 56229  95 207348  44 741.2   1
> Raid5s(4 disk)15744M 55238  98  81023  28  36859  14 56358  95 193030  38 605.7   1
> Raid5s(4 disk)15744M 54920  97  83680  29  36551  14 56917  95 185345  35 599.8   1

the block-read shows that even with 3 disks, you're hitting ~190 MB/s,
which is pretty close to your actual disk speed.  the low value for
block-out is probably just due to non-stripe writes needing R/M/W cycles.

> /dev/sdc      15744M 53861  95 102270  35  25718   6 37273  60  76275   8 377.0   0

the block-out is clearly distorted by buffer-cache (too high), but the
input rate is good and consistent.  obviously, it'll fall off somewhat
towards inner tracks, but will probably still be above 50 MB/s.

> Why do I only get 247 MB/s for writing and 227 MB/s for reading (from the
> bonnie++ results) for a Raid0 over 8 disks? I was expecting to get nearly
> three times those numbers if you take the numbers from the individual disks.

expecting 3x is unreasonable; 2x (480 or so) would be good.  I suspect
that some (sw kernel) components are badly tuned for fast IO.  obviously,
most machines are in the 50-100 MB/s range, so this is not surprising.
readahead is certainly one, but there are also magic numbers in MD as
well, not to mention PCI latency, scsi driver tuning, probably even
/proc/sys/vm settings.  a few concrete things to try are below.
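readahead is the easiest one to experiment with.  something like this
(untested - the device names are only examples, adjust to whatever your
md device actually is):

    # readahead is reported in 512-byte sectors
    blockdev --getra /dev/sdc
    blockdev --getra /dev/md0

    # try a much larger readahead on the array; 8192 sectors (4 MB) is
    # just a starting point to experiment with, not a recommendation
    blockdev --setra 8192 /dev/md0

re-run the sequential-read test after each change and see where it stops
helping.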
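similarly, if you re-create the filesystem you can tell ext3 about the
chunk size so it lays out its metadata stripe-aware.  with 64k chunks and
4k blocks the stride is 64/4 = 16; roughly (assuming a reasonably recent
e2fsprogs - older versions spelled it -R stride=16):

    # hypothetical example for your raid5 with the default 64k chunks
    mke2fs -j -b 4096 -O dir_index -E stride=16 /dev/md0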
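the PCI latency timer and the vm writeback knobs are also worth a look.
purely illustrative values, not recommendations:

    # raise the latency timer on the scsi controller
    # (bus 2, dev 4, func 0, taken from your lspci output)
    setpci -s 02:04.0 latency_timer=f8

    # let more dirty data accumulate before writeback kicks in
    sysctl -w vm.dirty_ratio=40
    sysctl -w vm.dirty_background_ratio=10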
I've got some 4x2.6G opteron servers (same board, 32G PC3200), but alas,
end-users have found out about them - not to mention that they only have
3x160G SATA disks...

regards, mark hahn.