>>> On Mon, 10 Mar 2008 09:54:07 +0100, Oliver Martin
>>> <oliver.martin@xxxxxxxxxxxxxxxxxxxx> said:

[ ... ]

> I was talking about stripe size, not chunk size. That 128KB
> stripe size is made up of n-1 chunks of an n-disk raid-5. In
> this case, 3 disks and 64KB chunk size result in 128KB stripe
> size.

Uhm, usually I would say that in such a case the stripe size is
192KiB, of which 128KiB are the data capacity/payload. I usually
think of the stripe as it is recorded on the array, from the point
of view of the RAID software. As you say here:

> I assume if you tell the file system about this stripe size
> (or it figures it out itself, as xfs does), it tries to align
> its structures such that whole-stripe writes are more likely
> than partial writes. This means that md only has to write
> 3*64KB (2x data + parity).

Indeed, the application above the filesystem has to write
carefully in 128KiB-long, 128KiB-aligned (to the start of the
array, not the start of the overlaying volume, as you point out)
transactions to avoid the high costs you describe here and
elsewhere.

As I was arguing in a recent post (with very explicit examples),
the wider the array (and the larger the chunk size), the higher
the cost and the lower the chance that the application (usually
the file system) manages to put together properly sized and
aligned transactions. XFS delayed writes help with that.

But then one of the many advantages of RAID10 is that all these
complications are largely irrelevant with it...

[ ... ]
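
To make the cost difference above concrete, here is a rough
back-of-the-envelope sketch in Python. It is only my simplified
model (chunk-granularity read-modify-write for partial writes),
not md's actual logic, which can also choose a reconstruct-write
path; the function name and the 3-disk/64KiB figures just follow
the example discussed in this thread.

#!/usr/bin/env python3
# Rough model of RAID-5 write cost: aligned full-stripe writes
# versus partial writes handled as read-modify-write.
# Simplified illustration only, not md's real algorithm.

CHUNK = 64 * 1024                 # chunk size in bytes
NDISKS = 3                        # disks in the array
PAYLOAD = (NDISKS - 1) * CHUNK    # data payload per stripe (128KiB here)

def chunk_ios(offset, length):
    """Return (reads, writes) in whole chunks for a write of
    `length` bytes at `offset` bytes into the array's data space."""
    reads = writes = 0
    end = offset + length
    stripe = offset // PAYLOAD
    while stripe * PAYLOAD < end:
        s_start = stripe * PAYLOAD
        s_end = s_start + PAYLOAD
        lo, hi = max(offset, s_start), min(end, s_end)
        if lo == s_start and hi == s_end:
            # aligned full-stripe write: write data chunks + parity,
            # nothing needs to be read back first
            writes += NDISKS
        else:
            # partial write: read the touched data chunks and the old
            # parity, then write them back (simplified read-modify-write)
            touched = (hi - 1) // CHUNK - lo // CHUNK + 1
            reads += touched + 1
            writes += touched + 1
        stripe += 1
    return reads, writes

# 128KiB write aligned to a stripe boundary: (0 reads, 3 chunk writes)
print(chunk_ios(0, 128 * 1024))
# the same 128KiB shifted by 64KiB straddles two stripes: (4 reads, 4 writes)
print(chunk_ios(64 * 1024, 128 * 1024))

The two printed cases show the point being argued: the identical
amount of application data costs 3 chunk writes when aligned, but
8 chunk I/Os (half of them reads) when it is merely shifted by one
chunk, and wider arrays or larger chunks make the aligned case
correspondingly harder to hit.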