Re: raid5 performance question

On Wednesday March 22, davidsen@xxxxxxx wrote:
> Neil Brown wrote:
> 
> >On Tuesday March 7, raziebe@xxxxxxxxx wrote:
> >  
> >
> >>Neil.
> >>What exactly is the stripe_cache?
> >>    
> >>
> >
> >In order to ensure correctness of data, all IO operations on a raid5
> >pass through the 'stripe cache'.  This is a cache of stripes where each
> >stripe is one page wide across all devices.
> >
> >e.g. to write a block, we allocate one stripe in the cache to cover
> >that block, pre-read anything that might be needed, copy in the new
> >data and update parity, and write out anything that has changed.
> >  
> >
> I can see that you would have to read the old data and parity blocks for 
> RAID-5, I assume that's what you mean by "might be needed" and not a 
> read of every drive to get the data to rebuild the parity from scratch. 
> That would be not only slower, but require complex error recovery on an 
> error reading unneeded data.

"might be needed" because sometimes raid5 reads the old copies of the
blocks it is about to over-write, and sometimes it reads all the
blocks that it is NOT going to over-write instead.  And if it is
over-writing all blocks in the stripe, it doesn't need to read
anything.
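
To make the three cases concrete, here is a minimal sketch in Python of how
parity can be brought up to date in each of them.  It is not the md driver's
code: the function names are made up for illustration, and the rule used in
choose_strategy (pick whichever needs fewer pre-reads, preferring the second
approach on a tie) is an assumption, not a statement of what raid5 does.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def parity_read_modify_write(old_parity, old_data, new_data):
    """Pre-read the old copies of the blocks being over-written, plus parity.

    new_parity = old_parity ^ old_block ^ new_block, for each changed block.
    """
    parity = old_parity
    for old, new in zip(old_data, new_data):
        parity = xor_blocks(parity, old, new)
    return parity


def parity_reconstruct_write(untouched_data, new_data):
    """Pre-read the blocks that are NOT being over-written.

    Parity is rebuilt from scratch.  With no untouched blocks this is the
    full-stripe case, which needs no pre-reads at all.
    """
    return xor_blocks(*untouched_data, *new_data)


def choose_strategy(n_data, n_written):
    """Hypothetical heuristic: return (strategy name, number of pre-reads)."""
    if n_written == n_data:
        return "full-stripe", 0
    rmw_reads = n_written + 1        # old data blocks plus old parity
    rcw_reads = n_data - n_written   # the data blocks left alone
    if rmw_reads < rcw_reads:
        return "read-modify-write", rmw_reads
    return "reconstruct-write", rcw_reads
```

Both approaches give the same parity; they differ only in which blocks have
to be read first, which is why the choice can depend on how much of the
stripe a write covers.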


> 
> >Similarly to read, we allocate a stripe to cover the block, read in
> >the required parts, and copy out of the stripe cache into the
> >destination.
> >
> >Requiring all reads to pass through the stripe cache is not strictly
> >necessary, but it keeps the code a lot easier to manage (fewer special
> >cases).   Bypassing the cache for simple read requests when the array
> >is non-degraded is on my list....
> >
> It sounds as if you do a memory copy with each read, even if a read to 
> user buffer would be possible. Hopefully I'm reading that wrong.

Unfortunately you are reading it correctly.

NeilBrown