I renamed the subject, as your question doesn't really apply to XFS or
the OP, but to md-RAID.

On 12/20/2013 4:43 PM, Arkadiusz Miśkiewicz wrote:

> I wonder why kernel is giving defaults that everyone repeatly
> recommends to change/increase? Has anyone tried to bugreport that
> for stripe_cache_size case?

The answer is a balance between default md-RAID5/6 write performance
and kernel RAM consumption, with more weight given to the latter. The
formula:

    (4096 bytes * stripe_cache_size) * num_drives = RAM consumed for stripe cache

High stripe_cache_size values will cause the kernel to eat non-trivial
amounts of RAM for the stripe cache buffer. This table demonstrates the
effect today for typical RAID5/6 disk counts.

    stripe_cache_size    drives    RAM consumed
          256               4          4 MB
                            8          8 MB
                           16         16 MB
          512               4          8 MB
                            8         16 MB
                           16         32 MB
         1024               4         16 MB
                            8         32 MB
                           16         64 MB
         2048               4         32 MB
                            8         64 MB
                           16        128 MB
         4096               4         64 MB
                            8        128 MB
                           16        256 MB

The powers that be, Linus in particular, are not fond of default
settings that tie up a lot of kernel memory. The default md-RAID5/6
stripe_cache_size of 256 yields 1MB consumed per member device.

With SSDs becoming mainstream, and ever faster, the md-RAID5/6
architecture will at some point have to be redesigned because of the
memory footprint required for good performance. Currently the required
size of the stripe cache appears directly proportional to the aggregate
write throughput of the RAID devices, so the optimal value will vary
greatly from one system to another depending on the throughput of the
drives.

For example, I assisted a user with 5x Intel SSDs back in January, and
his system required 4096, or 80MB of RAM for stripe cache, to reach the
maximum write throughput of the devices. This yielded 600MB/s, or 60%
greater throughput than the 2048 setting (40MB of RAM for cache). In
his case the 75MB of RAM beyond the default was well worth it, as the
machine was an iSCSI target server with 8GB of RAM.

In the earlier case in this thread, 5x rust in RAID6, the 2048 value
seemed optimal (though not yet verified), requiring 40MB less RAM than
the 5x Intel SSDs. For a 3-drive modern rust RAID5, the default of 256,
or 3MB, is close to optimal but maybe a little low.

Consider that 256 has been the default for a very long time. It was
selected back when average drive throughput was much, much lower, as in
50MB/s or less, SSDs hadn't yet been invented, and system memories were
much smaller. Due to the massive difference in throughput between rust
and SSD, any meaningful change in the default really requires new code
to sniff out what type of devices constitute the array (if that's even
possible, and it probably isn't) and set a suitably low default
accordingly. Again, SSDs didn't exist when md-RAID was written, nor
when this default was chosen, and that throws a big monkey wrench into
the spokes.

-- 
Stan
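A minimal sketch of the arithmetic above, assuming Python 3 and the
standard sysfs attribute /sys/block/<md>/md/stripe_cache_size; the
script and helper names are illustrative, not part of md:

    #!/usr/bin/env python3
    # stripe_cache_ram.py -- print the stripe cache RAM table for a given
    # drive count and, optionally, show an array's current setting.
    import sys
    from pathlib import Path

    PAGE_SIZE = 4096  # each stripe cache entry is one page per member device

    def stripe_cache_ram(stripe_cache_size: int, num_drives: int) -> int:
        """RAM consumed in bytes: (4096 * stripe_cache_size) * num_drives."""
        return PAGE_SIZE * stripe_cache_size * num_drives

    def current_stripe_cache_size(md_device: str) -> int:
        """Read the live value, e.g. /sys/block/md0/md/stripe_cache_size."""
        path = Path(f"/sys/block/{md_device}/md/stripe_cache_size")
        return int(path.read_text().strip())

    if __name__ == "__main__":
        drives = 5  # matches the 5-drive examples above
        for size in (256, 512, 1024, 2048, 4096):
            mb = stripe_cache_ram(size, drives) // (1024 * 1024)
            print(f"stripe_cache_size={size:5d}  {drives} drives  {mb:4d} MB")
        if len(sys.argv) > 1:
            dev = sys.argv[1]  # e.g. "md0"
            print(f"{dev}: current stripe_cache_size =",
                  current_stripe_cache_size(dev))

Pass an md device name as the first argument (e.g. md0) to print the
live setting alongside the table; raising it is just an echo of the new
value into that same sysfs file.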