On 1/2/2014 8:49 AM, joystick wrote:
> For a 4k write in raid5, two 4k sectors are read, then
> two 4k sectors are written, and this is completely independent from
> chunk size.

First, there is no such thing as a 4K sector in Linux. Sectors are 512 bytes. Filesystem blocks and memory pages are 4K.

I'm no expert WRT raid5.c/raid6.c, but I'm pretty sure it doesn't work as you state. I'm pretty sure it works like this:

Redundancy is maintained at the chunk level, not the filesystem block level or page level. When modifying a single filesystem block, md will read the data chunk of the stripe in which the 8 sectors of the 4KB block reside, write back the chunk incorporating the changes to those 8 sectors, read the parity chunk, recalculate the parity chunk based on the new data chunk, and then write back the parity chunk.

This is precisely why many folks, including myself, consider the current 512KB chunk default to be way too high. Modifying a single 4KB filesystem block requires reading two 512KB chunks (1MB) from disk and writing two 512KB chunks (1MB) back, a total of 2MB of IO just to modify a single 4KB page.

And AFAIK this is the best case scenario. According to past posts by Neil, IIRC, the current RAID5/6 code may read more than just two chunks during RMW depending on certain factors. With RAID6 you have at least one extra chunk write, if not an extra chunk read, so your IO is at least 2.5MB for a single 4K write with RAID6.

--
Stan
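
P.S. For anyone who wants to check the arithmetic, here is a minimal sketch of the numbers above. It only encodes the chunk-level read-modify-write model as I described it (which, as I said, I have not verified against raid5.c), and the function name and parameters are made up for illustration, not anything taken from md.

```python
KIB = 1024
MIB = 1024 * KIB


def rmw_io(chunk_bytes, parity_chunks=1, read_all_parity=False):
    """Return (read_bytes, write_bytes) for one small write.

    Assumed model (from the post, not from the md source): the data
    chunk and one parity chunk are read, then the data chunk and every
    parity chunk are written back. With read_all_parity=True, every
    parity chunk is read as well.
    """
    parity_reads = parity_chunks if read_all_parity else 1
    read_bytes = (1 + parity_reads) * chunk_bytes
    write_bytes = (1 + parity_chunks) * chunk_bytes
    return read_bytes, write_bytes


if __name__ == "__main__":
    cases = [
        ("RAID5, 512KB chunk", 512 * KIB, 1),
        ("RAID6, 512KB chunk", 512 * KIB, 2),
        ("RAID5, 64KB chunk", 64 * KIB, 1),
    ]
    for label, chunk, parity in cases:
        r, w = rmw_io(chunk, parity)
        total = (r + w) / MIB
        print(f"{label}: read {r / MIB:.3f} MiB, "
              f"write {w / MIB:.3f} MiB, total {total:.3f} MiB")
```

Under this model a 512KB chunk gives 2MB of total IO for RAID5 and 2.5MB for RAID6 (3MB if the second parity chunk is also read), while a 64KB chunk brings the RAID5 figure down to 256KB.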