Neil Brown (neilb@xxxxxxx) wrote on 19 November 2005 16:54:

>There are two solutions to this silent corruption problem (other than
>'ignore it and hope it doesn't bite' which is a fairly widely used
>solution, and I haven't seen any bite marks myself).

It happened to me several years ago when two disks failed almost
simultaneously due to SCSI bus problems. I had to re-assemble the array
anyway and some files got corrupted :-( That's why I ended up putting
each disk on an independent bus and cable...

>One is journalling, as has been mentioned. This could be done to a
>mirrored pair, or to an ECC NVRAM card (the latter being probably the
>best, though also most expensive). You would write each data block as
>it becomes available, and each parity block just before commencing a
>write to the raid5. Obviously you also keep track of what you have
>written.
>I have toyed with the idea of implementing this, but I think demand is
>sufficiently low that it isn't worth it.
>
>The other is to use a filesystem that allows the problem to be avoided
>by making sure that the only blocks that can be corrupted are dead
>blocks.
>This could be done with a copy-on-write filesystem that knows about the
>raid5 geometry, and only ever writes to a stripe when no other blocks
>on the stripe contain live data.
>I've been working on a filesystem which does just this, and hope to
>have it available in a year or two (it is a background 'hobby'
>project).

I think the demand for any solution to the unclean array problem is
indeed low because of the small probability of a double failure. Those
who want more reliability can use a spare drive that resyncs
automatically, or raid6 (or both).
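
For anyone following the thread who wants to see the problem concretely, here is a minimal sketch in Python. It assumes a hypothetical 3-disk raid5 with plain XOR parity and models a single stripe as a dict; the names (xor_blocks, stripe, journal) are made up for illustration and this is not how the md driver is written. It shows a crash between the data write and the parity write, followed by the loss of a different disk, and then the same rebuild after replaying a journal of the kind Neil describes.

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (raid5 parity is plain XOR)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe on a hypothetical 3-disk raid5: two data blocks plus parity.
d0, d1 = b"AAAA", b"BBBB"
stripe = {"d0": d0, "d1": d1, "p": xor_blocks(d0, d1)}

# Crash in the middle of a stripe update: the new d0 reached the disk,
# but the matching parity write did not, so the parity is now stale.
stripe["d0"] = b"CCCC"

# Later the disk holding d1 fails. d1 was never being written, yet
# rebuilding it from d0 and the stale parity gives the wrong contents.
rebuilt_d1 = xor_blocks(stripe["d0"], stripe["p"])
corrupted = rebuilt_d1 != d1

# With a journal, the data and parity blocks are logged before the
# raid5 write starts, so they can be replayed after the crash.
journal = {"d0": b"CCCC", "p": xor_blocks(b"CCCC", d1)}
stripe.update(journal)
replayed_d1 = xor_blocks(stripe["d0"], stripe["p"])

print("rebuild from stale parity is corrupt:", corrupted)
print("rebuild after journal replay matches:", replayed_d1 == d1)
```

The point of the sketch is that the damaged block is one that was not being written at the time of the crash, which is why the corruption is silent. A copy-on-write filesystem that only writes to stripes with no other live data avoids this because the block rebuilt incorrectly would be a dead one.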