Bill Moran wrote:
In order to recalculate the parity, it has to have data from all disks. Thus,
if you have 4 disks, it has to read 2 (the unknown data blocks included in
the parity calculation) then write 2 (the new data block and the new
parity data). Caching can help some, but if your data ends up being any
size at all, the cache misses become more frequent than the hits. Even
when caching helps, your max speed is still only the speed of a single
disk.
If you have 4 disks, it can do either:
1) Read the old block, read the parity block, XOR the old block with
the parity block and the new block resulting in the new parity block,
write both the new parity block and the new block.
2) Read the two unknown blocks, XOR with the new block resulting in
the new parity block, write both the new parity block and the new block.
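To make the two options concrete, here is a minimal sketch in Python. It
assumes a single 4-disk stripe (three data blocks and one parity block) and
plain XOR parity; the block contents and the helper name xor_blocks are made
up for illustration and do not come from any particular RAID implementation.

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe on 4 disks: three data blocks plus one parity block.
d0, d1, d2 = b"\x0f\x0f", b"\x33\x33", b"\x55\x55"
parity = xor_blocks(d0, d1, d2)

# We want to overwrite d0 with new contents.
new_d0 = b"\xaa\xaa"

# Option 1: read the old block and the old parity block (2 reads),
# then XOR both with the new block.
parity_option1 = xor_blocks(d0, parity, new_d0)

# Option 2: read the other data blocks in the stripe (2 reads on 4 disks),
# then XOR them with the new block.
parity_option2 = xor_blocks(d1, d2, new_d0)

# Both routes arrive at the same new parity block.
assert parity_option1 == parity_option2
```

Either way the array then writes two blocks: the new data block and the new
parity block.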
You are emphasizing 2 - but the scenario is also overly simplistic.
Imagine you had 10 drives on RAID 5. Would it make more sense to read 8
blocks and then write two (option 2, and the one you describe), or read
two blocks and then write two (option 1)? Obviously, if option 1 or
option 2 can be satisfied from cache, it is better to not read at all.
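The read counts for a single-block write can be tallied with a small Python
sketch. It assumes one parity block per stripe, nothing in cache, and a
write that touches exactly one data block; the function name
small_write_reads is hypothetical.

```python
def small_write_reads(n_disks):
    """Return (option1_reads, option2_reads) for a one-block RAID 5 write.

    Assumes one parity block per stripe and no cache hits.
    """
    option1 = 2            # old data block + old parity block
    option2 = n_disks - 2  # every data block except the one being overwritten
    return option1, option2

for n in (4, 10):
    reads1, reads2 = small_write_reads(n)
    print(f"{n} disks: option 1 reads {reads1}, option 2 reads {reads2}")
```

With 4 disks the two options tie at 2 reads each; with 10 disks option 1
still needs 2 reads while option 2 needs 8. Both options write 2 blocks.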
I note that you also disagree with Dave, in that you are not claiming it
performs consistency checks on read. No system does this as performance
would go to the crapper.
Cheers,
mark
--
Mark Mielke <mark@xxxxxxxxx>