On 12/3/10 2:02 AM, Mikael Abrahamsson wrote:

> "--assume-clean".

Thanks.

> Some raid implementations won't read/write to all drives, but might
> instead read the block being written to, and the parity block, then
> write the new block and recalculate the parity, thus not read/writing to
> all blocks. If this is the case, if the parity is wrong, it'll still be
> wrong after the operation, thus you don't have any redundancy.

Good point. That had occurred to me too, but I didn't know if Linux did
that. I can see how one might dynamically pick one way or the other
depending on how much of the stripe is already in the buffer cache.

> Doing a rebuild when creating the array is something I'd only skip if I
> was doing lab work, never in production. I use raid for redundancy, thus
> I want to make sure everything is ok and it doesn't matter to me if it
> takes half a day.

I hear you. But I think an important special case is when you're
initially loading a new RAID-5 array from an existing (typically
smaller) file system that will then be replaced by the new array. Why
not let the new array work something like a RAID-0, leaving the parity
blocks unwritten until you're finished loading the array? Then pass
through the array writing all the parity blocks with the final data. If
a drive fails in the new array before you're done, you still have all
your original data; you haven't lost anything.

Ultimately, RAID-5 in software is always going to be at least somewhat
vulnerable because of the lack of an atomic (all or none) committed
write of all the blocks in a stripe. An interrupted stripe write might
silently corrupt an old, stable file in a way that you won't notice
until a drive fails and you don't have the redundancy you thought you
had to reconstruct it. I can accept losing whatever files I was writing
at the time of a crash, but silent corruption of an old and stable file
seems far more insidious.
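
To make the read-modify-write point quoted above concrete, here is a
minimal sketch, assuming plain XOR parity over a single stripe of three
data blocks. The function names are hypothetical and this is only an
illustration of the arithmetic, not how the Linux md driver is actually
structured.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rmw_write(data, parity, index, new_block):
    """Read-modify-write: uses only the old target block and the old parity."""
    old_block = data[index]
    new_parity = xor_blocks([parity, old_block, new_block])
    data[index] = new_block
    return new_parity

def full_stripe_write(data, index, new_block):
    """Reconstruct-write: recomputes parity from every data block."""
    data[index] = new_block
    return xor_blocks(data)

def stripe_is_consistent(data, parity):
    """What a parity check verifies for one stripe."""
    return xor_blocks(data) == parity

# One stripe whose parity block was never initialised (assumed garbage).
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
stale_parity = b"\x00\x00"
new_block = b"\xaa\xbb"

# Read-modify-write carries the existing parity error forward.
d1 = list(stripe)
p1 = rmw_write(d1, stale_parity, 1, new_block)
print("after read-modify-write:", stripe_is_consistent(d1, p1))

# A full-stripe recompute repairs the parity as a side effect.
d2 = list(stripe)
p2 = full_stripe_write(d2, 1, new_block)
print("after full-stripe write:", stripe_is_consistent(d2, p2))
```

The first check reports an inconsistent stripe and the second a
consistent one: whatever error the parity block started with survives a
read-modify-write unchanged, so only stripes that happen to get a full
recompute (or a later resync) end up with usable redundancy.
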
I do periodically run checkarray to ensure that the parities are
consistent, but this takes a long time and seems inelegant somehow.
Maybe we need software ECC on all data so that one doesn't have to rely
on the drive itself to detect errors.

Thanks,

Phil