On Tue, 2006-01-31 at 12:47 -0800, Luke Lonergan wrote:
> Jeffrey,
>
> On 1/31/06 12:03 PM, "Jeffrey W. Baker" <jwbaker@xxxxxxx> wrote:
> > Linux does balanced reads on software
> > mirrors. I'm not sure why you think this can't improve bandwidth. It
> > does improve streaming bandwidth as long as the platter STR is more than
> > the bus STR.
>
> ... Prove it.

It's clear that Linux software RAID1, and by extension RAID10, does
balanced reads, and that these balanced reads double the bandwidth. A
quick glance at the kernel source code, and a trivial test, prove the
point.

In this test, sdf and sdg are Seagate 15k.3 disks on a single channel of
an Adaptec 39320, but the enclosure, and therefore the bus, is capable
of only Ultra160 operation.

# grep md0 /proc/mdstat
md0 : active raid1 sdf1[0] sdg1[1]

# dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=0 & dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=400000
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 48.243362 seconds (67922298 bytes/sec)
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 48.375897 seconds (67736211 bytes/sec)

That's 136MB/sec, for those following along at home. With only two
disks in a RAID1, you can nearly max out the SCSI bus.

# dd if=/dev/sdf1 of=/dev/null bs=8k count=400000 skip=0 & dd if=/dev/sdf1 of=/dev/null bs=8k count=400000 skip=400000
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 190.413286 seconds (17208883 bytes/sec)
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 192.096232 seconds (17058117 bytes/sec)

That, on the other hand, is only 34MB/sec. With two threads, the RAID1
is 296% faster.
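
For anyone who wants to redo the arithmetic, here is a minimal sketch of one way to turn a batch of dd summary lines into a single aggregate figure: total bytes read by all readers, divided by the longest elapsed time. The function name aggregate_mbps is hypothetical, and the method is an assumption, not necessarily how the figures above were derived. It counts MB as 10^6 bytes and expects the "bytes transferred in N seconds" wording shown in the transcripts above.

```shell
# Hypothetical helper (not from the original post): aggregate throughput of
# several concurrent dd readers. Reads dd summary lines on stdin, such as
#   3276800000 bytes transferred in 48.243362 seconds (67922298 bytes/sec)
# and prints total bytes / longest elapsed time in MB/s (1 MB = 10^6 bytes).
aggregate_mbps() {
    LC_ALL=C awk '
        /bytes transferred in/ {
            total += $1                            # field 1: bytes read by this dd
            if ($5 + 0 > longest) longest = $5 + 0 # field 5: elapsed seconds
        }
        END {
            if (longest > 0) printf "%.1f\n", total / longest / 1000000
        }'
}
```

Pasting the two md0 summary lines above into aggregate_mbps prints 135.5, and the two sdf1 lines give 34.1. Summing the per-process rates that dd reports gives slightly higher numbers, because the readers do not all finish at the same moment.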

# dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=0 & dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=400000 & dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=800000 & dd if=/dev/md0 of=/dev/null bs=8k count=400000 skip=1200000 &
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 174.276585 seconds (18802296 bytes/sec)
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 181.581893 seconds (18045852 bytes/sec)
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 183.724243 seconds (17835425 bytes/sec)
400000+0 records in
400000+0 records out
3276800000 bytes transferred in 184.209018 seconds (17788489 bytes/sec)

That's 71MB/sec with 4 threads...

# dd if=/dev/sdf1 of=/dev/null bs=8k count=100000 skip=0 & dd if=/dev/sdf1 of=/dev/null bs=8k count=100000 skip=400000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=100000 skip=800000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=100000 skip=1200000 &
100000+0 records in
100000+0 records out
819200000 bytes transferred in 77.489210 seconds (10571794 bytes/sec)
100000+0 records in
100000+0 records out
819200000 bytes transferred in 87.628000 seconds (9348610 bytes/sec)
100000+0 records in
100000+0 records out
819200000 bytes transferred in 88.912989 seconds (9213502 bytes/sec)
100000+0 records in
100000+0 records out
819200000 bytes transferred in 90.238705 seconds (9078144 bytes/sec)

Only 36MB/sec for the single disk. 96% advantage for the RAID1.

# dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=0 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=400000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=800000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=1200000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=1600000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=2000000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=2400000 & dd if=/dev/md0 of=/dev/null bs=8k count=50000 skip=2800000 &
50000+0 records in
50000+0 records out
409600000 bytes transferred in 35.289648 seconds (11606803 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 42.653475 seconds (9602969 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 43.524714 seconds (9410745 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 45.151705 seconds (9071640 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 47.741845 seconds (8579476 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 48.600533 seconds (8427891 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 48.758726 seconds (8400548 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 49.679275 seconds (8244887 bytes/sec)

66MB/s with 8 threads.

# dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=0 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=400000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=800000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=1200000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=1600000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=2000000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=2400000 & dd if=/dev/sdf1 of=/dev/null bs=8k count=50000 skip=2800000 &
50000+0 records in
50000+0 records out
409600000 bytes transferred in 73.873911 seconds (5544583 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 75.613093 seconds (5417051 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 79.988303 seconds (5120749 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 79.996440 seconds (5120228 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 84.885172 seconds (4825342 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 92.995892 seconds (4404496 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 99.180337 seconds (4129851 bytes/sec)
50000+0 records in
50000+0 records out
409600000 bytes transferred in 100.144752 seconds (4090080 bytes/sec)

33MB/s. RAID1 gives a 100% advantage at 8 threads.

I think I've proved my point. Software RAID1 read balancing provides
0%, 300%, 100%, and 100% speedup on 1, 2, 4, and 8 threads,
respectively. In the presence of random I/O, the results are even
better.

Anyone who thinks they have a single-threaded workload has not yet
encountered the autovacuum daemon.

-Jeff