On Wednesday March 22, davidsen@xxxxxxx wrote:
> Neil Brown wrote:
>
> >On Tuesday March 7, raziebe@xxxxxxxxx wrote:
> >
> >>Neil.
> >>what is the stripe_cache exactly ?
> >>
> >
> >In order to ensure correctness of data, all IO operations on a raid5
> >pass through the 'stripe cache'.  This is a cache of stripes where each
> >stripe is one page wide across all devices.
> >
> >e.g. to write a block, we allocate one stripe in the cache to cover
> >that block, pre-read anything that might be needed, copy in the new
> >data and update parity, and write out anything that has changed.
> >
> I can see that you would have to read the old data and parity blocks for
> RAID-5, I assume that's what you mean by "might be needed" and not a
> read of every drive to get the data to rebuild the parity from scratch.
> That would be not only slower, but require complex error recovery on an
> error reading unneeded data.

"might be needed" because sometimes raid5 reads the old copies of the
blocks it is about to over-write (plus the old parity), and sometimes
it reads all the blocks that it is NOT going to over-write instead.
And if it is over-writing all blocks in the stripe, it doesn't need to
read anything.

> >Similarly to read, we allocate a stripe to cover the block, read in
> >the required parts, and copy out of the stripe cache into the
> >destination.
> >
> >Requiring all reads to pass through the stripe cache is not strictly
> >necessary, but it keeps the code a lot easier to manage (fewer special
> >cases).  Bypassing the cache for simple read requests when the array
> >is non-degraded is on my list....
> >
> It sounds as if you do a memory copy with each read, even if a read to
> user buffer would be possible. Hopefully I'm reading that wrong.

Unfortunately you are reading it correctly.

NeilBrown
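
P.S. The three pre-read cases described above (read the blocks being over-written, read the blocks NOT being over-written, or read nothing for a full-stripe write) can be sketched in a few lines of Python. This is only an illustration of the XOR arithmetic, not the md driver's code: the names xor_blocks, plan_write and new_parity are made up for this sketch, and the "pick whichever needs fewer reads" rule is an assumption, since the actual driver also takes into account which blocks are already in the stripe cache.

```python
from functools import reduce
from typing import Dict, List, Set


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def plan_write(n_data: int, dirty: Set[int]) -> str:
    """Choose a pre-read strategy for a stripe with n_data data blocks.

    Assumed heuristic (hypothetical): pick the option with fewer reads.
    """
    if len(dirty) == n_data:
        return "full-stripe"              # nothing needs to be read
    rmw_reads = len(dirty) + 1            # old copies of dirty blocks + old parity
    rcw_reads = n_data - len(dirty)       # every block NOT being over-written
    return "read-modify-write" if rmw_reads < rcw_reads else "reconstruct-write"


def new_parity(old_data: List[bytes], old_parity: bytes,
               updates: Dict[int, bytes]) -> bytes:
    """Return the parity block after applying updates {block index: new data}."""
    strategy = plan_write(len(old_data), set(updates))
    if strategy == "read-modify-write":
        # Cancel each old block out of the parity and fold the new one in.
        parity = old_parity
        for index, new_block in updates.items():
            parity = xor_blocks(parity, old_data[index], new_block)
        return parity
    # Full-stripe and reconstruct-write both recompute parity from scratch;
    # they differ only in whether untouched blocks had to be read first.
    merged = [updates.get(i, old_data[i]) for i in range(len(old_data))]
    return xor_blocks(*merged)
```

With four data blocks, over-writing one block selects read-modify-write (2 reads against 3), over-writing two or three selects reconstruct-write, and over-writing all four needs no reads at all. Whichever path is taken, the result equals the XOR of the final data blocks.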