On Wed, Apr 28, 2010 at 9:34 PM, Neil Brown <neilb@xxxxxxx> wrote:
> On Tue, 27 Apr 2010 10:18:36 -0700
> Joe Williams <jwilliams315@xxxxxxxxx> wrote:
>> The default setting for stripe_cache_size was 256. So 256 x 4K = 1024K
>> per device, which would be two stripes, I think (you commented to that
>> effect earlier). But somehow the default setting was not optimal for
>> sequential write throughput. When I increased stripe_cache_size, the
>> sequential write throughput improved. Does that make sense? Why would
>> it be necessary to cache more than 2 stripes to get optimal sequential
>> write performance?
>
> The individual devices have some optimal write size - possible one
> track or one cylinder (if we pretend those words mean something useful these
> days).
> To be able to fill that you really need that much cache for each device.
> Maybe your drives work best when they are sent 8M (16 stripes, as you say in
> a subsequent email) before expecting the first write to complete..
>
> You say you get about 250MB/sec, so that is about 80MB/sec per drive
> (3 drives worth of data).
> Rotational speed is what? 10K? That is 166revs-per-second.

Actually, 5400rpm.

> So about 500K per revolution.

About twice that, about 1 MB per revolution.

> I imagine you would need at least 3 revolutions worth of data in the cache,
> one that is currently being written, one that is ready to be written next
> (so the drive knows it can just keep writing) and one that you are in the
> process of filling up.
> You find that you need about 16 revolutions (it seems to be about one
> revolution per stripe). That is more than I would expect .... maybe there is
> some extra latency somewhere.

So about 8 revolutions in the cache. 2 to 3 times what might be expected
to be needed for optimal performance. Hmmm.

16 stripes comes to 16*512KB per drive, or about 8MB per drive. At
about 100MB/s, that is about 80 msec worth of writing. I don't see
where 80 msec of latency might come from.
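
For anyone who wants to rerun the arithmetic above, here is a rough
sketch in Python. The inputs are only the approximate figures quoted in
this thread (5400rpm, about 80MB/s per drive, 512KB per drive per
stripe, 16 stripes, and the rounded 100MB/s used for the latency
estimate). The function names are made up for illustration and none of
this is a measurement.

```python
# Back-of-envelope check of the figures quoted in this thread.
# Every input is an approximate number from the emails, not a measurement.

RPM = 5400               # drive rotational speed
PER_DRIVE_MB_S = 80.0    # ~250 MB/s spread over 3 drives' worth of data
CHUNK_KB = 512           # per-drive share of one stripe
STRIPES_CACHED = 16      # stripes that gave the best sequential writes
NOMINAL_MB_S = 100.0     # rounded rate used for the latency estimate


def revs_per_second(rpm):
    """Revolutions per second for a given rpm."""
    return rpm / 60.0


def mb_per_revolution(mb_per_s, rpm):
    """Data written during one platter revolution, in MB."""
    return mb_per_s / revs_per_second(rpm)


def cache_mb_per_drive(stripes, chunk_kb):
    """Cached data per drive, in MB, for a number of stripes."""
    return stripes * chunk_kb / 1024.0


def write_time_ms(cache_mb, mb_per_s):
    """Time to write cache_mb at mb_per_s, in milliseconds."""
    return cache_mb / mb_per_s * 1000.0


if __name__ == "__main__":
    per_rev = mb_per_revolution(PER_DRIVE_MB_S, RPM)
    cached = cache_mb_per_drive(STRIPES_CACHED, CHUNK_KB)
    print("revs per second:      %.0f" % revs_per_second(RPM))
    print("MB per revolution:    %.2f" % per_rev)
    print("cached MB per drive:  %.1f" % cached)
    print("revolutions in cache: %.1f" % (cached / per_rev))
    print("ms of writing cached: %.0f" % write_time_ms(cached, NOMINAL_MB_S))
```

With these inputs it comes out to a bit under 1MB per revolution and
roughly 8 to 9 revolutions held in the cache, in line with the rounded
numbers above.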

Could it be a quirk of NCQ? I think each HDD has an NCQ queue depth of
31. But 31 512-byte sectors is only about 16KB, so that does not seem
relevant.