On Tuesday January 17, jeff@xxxxxxx wrote:
> Is this a real issue or ignorable Sun propaganda?

Well.... the 'raid-5 write hole' is old news. It's been discussed on
this list several times and doesn't seem to actually stop people
getting a lot of value out of software raid5.

Nonetheless, their raid-z certainly seems interesting, though I feel
the term is misleading. raid-z doesn't provide a virtual storage
device in which you can store whatever filesystem you like. raid-z is
their code name for a particular aspect of the ZFS filesystem.

Though some of these details are guessed and so might be wrong, it
probably goes something like this:

ZFS uses a 'variable block size' which is probably very similar to
what other filesystems call 'extents'. When an extent is written, a
hash (aka checksum or MIC - message integrity check) is calculated and
stored, probably with the indexing information. This makes it easy to
check for media errors.

Also the extent is possibly written over various devices, quite
possibly at different locations on the different devices. It might be
written twice, thus producing effective mirroring. It might be chopped
up into bits with the bits written to different devices and a parity
block written to another device. This produces an effect similar to
raid5. This layout can even be different for different blocks.

On a regular (Ext3 like) filesystem this would be very awkward, as
updating a block in place would be confusingly hard. However ZFS never
updates in place. It is 'copy on write', so any change is written to a
new location, and updating the indexing and MIC is all part of the
package.

Note that not only data blocks, but also indirect blocks and all
metadata could be duplicated or striped with parity.

This is definitely a clever idea, as are lots of the ideas in ZFS.
But just because someone has had a clever idea, that doesn't reduce
the value of existing clever ideas like raid5.

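
To make the guessed scheme above a little more concrete, here is a toy
sketch in Python. It is not ZFS code and makes no claim about how ZFS
really lays data out. ToyPool, split_with_parity and xor_blocks are
made-up names, the "devices" are just dictionaries, and SHA-256 stands
in for whatever checksum is really used. It only shows the three ideas
together: chop an extent into chunks plus an XOR parity chunk, keep
the checksum with the index, and never overwrite in place.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def split_with_parity(data, n_data):
    """Chop data into n_data equal chunks (zero padded) plus one parity chunk."""
    size = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(size * n_data, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    return chunks + [xor_blocks(chunks)]


class ToyPool:
    """Hypothetical pool: one dict per device, plus an index of extents."""

    def __init__(self, n_devices):
        self.devices = [dict() for _ in range(n_devices)]  # location -> chunk
        self.index = {}     # name -> (location, length, checksum)
        self.next_loc = 0   # locations are never reused in this toy

    def write(self, name, data):
        # Copy on write: always pick a fresh location, never touch old chunks.
        loc = self.next_loc
        self.next_loc += 1
        chunks = split_with_parity(data, len(self.devices) - 1)
        for dev, chunk in zip(self.devices, chunks):
            dev[loc] = chunk
        # The index (with the checksum) is switched over only after the
        # chunks are in place, so a reader sees the old or the new extent.
        self.index[name] = (loc, len(data), hashlib.sha256(data).hexdigest())

    def read(self, name):
        loc, length, digest = self.index[name]
        chunks = [dev.get(loc) for dev in self.devices]
        missing = [i for i, c in enumerate(chunks) if c is None]
        if len(missing) > 1:
            raise IOError("more than one chunk missing")
        if missing:
            # Rebuild the one missing chunk from the XOR of all the others.
            others = [c for c in chunks if c is not None]
            chunks[missing[0]] = xor_blocks(others)
        data = b"".join(chunks[:-1])[:length]  # drop parity and padding
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError("checksum mismatch: media error detected")
        return data
```

Emptying one of the device dictionaries simulates a failed drive, and
read() still returns the data. Writing the same name twice leaves the
old chunks where they were and only moves the index entry, which is
the copy-on-write part.
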
In general, I think increasing the connection between the filesystem
and the volume manager/virtual storage is a good idea. Finding the
right balance is not going to be trivial. ZFS has taken one very
interesting approach. There are others.

I have a feeling the above isn't as coherent as I would like. Maybe I
should go to bed....

>
> -----Original Message-----
> From: I-Gene Leong
> Subject: RE: [colo] OT: Server Hardware Recommendations
> Date: Mon, 16 Jan 2006 14:10:33 -0800
>
> There was an interesting blog entry out in relation to Sun's RAID-Z
> talking about RAID-5 shortcomings:
>
> http://blogs.sun.com/roller/page/bonwick?entry=raid_z
>
> It sounds to me like RAID-1 would also be vulnerable to the write hole
> mentioned inside.

The 'write hole' exists for all raid levels with redundancy. The
'resync' process after an unclean shutdown closes the hole,
eventually. With raid-5, a drive failure while the hole is open means
potential undetectable data loss. With raid-1, a drive failure
doesn't imply data loss even during the hole.

NeilBrown