On Aug 18, 2012, at 9:17 PM, Stan Hoeppner wrote:

>>
>
> Yes, as in the case of XFS journal alignment, where the maximum stripe
> unit (chunk) size is 256KB and the recommended size is 32KB. This is a
> 100% metadata workload, making full stripe writes difficult even with a
> small stripe unit (chunk). Large chunks simply make it much worse. And
> every modern filesystem uses a journal…

I agree that a bigger chunk size is not inherently better. I suspect 512K
was selected as the default because it suits most people's storage loads,
which aren't spectacularly heavy (either data or metadata). But all the
mdadm documentation I can find drives home that to get the best
performance, you have to test.

One small quibble, however: the three newest filesystems (ZFS, btrfs,
ReFS) don't use journals.

>
>> Overall, I think 512Kb is quite a good chunk size, even for a raid5
>> array.
>
> I emphatically disagree. For the vast majority of workloads, with a
> 512KB chunk RAID5/6, nearly every write will trigger RMW, and RMW is
> what kills parity array performance. And RMW is *far* more costly than
> sending smaller vs larger IOs to the drives.

I thought that default seemed a bit high, but I'll bet you dollars to
donuts that the vast majority of workloads using default settings for
parity RAID are 4+MB files like music and video. If you have a really
busy mail server with lots of tiny files, then you've got a pretty strong
case that 512K across maybe 6 disks is going to lead to a lot of
unnecessary RMW, and that a lower chunk size will help a lot.

Chris Murphy
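
To put rough numbers on the "512K across maybe 6 disks" case, here is a minimal Python sketch of the arithmetic. It is a simplified model, not how md actually schedules I/O: it assumes RAID5 with one parity chunk per stripe, and it counts a write as RMW-free only where it covers whole, aligned stripes. It ignores the md stripe cache and any write coalescing by the filesystem or block layer. The function names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def full_stripe_bytes(chunk_kib, total_disks, parity_disks=1):
    """Bytes of data in one full stripe (parity chunks excluded)."""
    data_disks = total_disks - parity_disks
    return chunk_kib * KIB * data_disks


def split_write(offset, length, stripe):
    """Split one write into (full_stripe_bytes, partial_stripe_bytes).

    Only the part that starts and ends on stripe boundaries counts as
    full-stripe; everything else lands in a partially written stripe,
    which in this simplified model is where RMW happens.
    """
    end = offset + length
    first_boundary = -(-offset // stripe) * stripe  # round offset up
    last_boundary = (end // stripe) * stripe        # round end down
    full = max(0, last_boundary - first_boundary)
    return full, length - full


if __name__ == "__main__":
    for chunk_kib in (512, 64):
        stripe = full_stripe_bytes(chunk_kib, total_disks=6)
        for label, size in (("4 MiB media file", 4 * MIB),
                            ("16 KiB mail message", 16 * KIB)):
            full, partial = split_write(0, size, stripe)
            print(f"chunk={chunk_kib}K stripe={stripe // KIB}K "
                  f"{label}: full={full // KIB}K partial={partial // KIB}K")
```

Under these assumptions, a 6-disk RAID5 with 512K chunks has a 2560K full stripe, so a 16K mail-sized write never fills a stripe, while a 4M write starting on a stripe boundary fills one stripe and leaves a partial tail. Actual behaviour depends on alignment and caching, which is why testing with the real workload is still the only reliable guide.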