david@xxxxxxx wrote:
> just duplicating the RAID 4 or 5 parity information will not help you
> if the parity drive is not one of the drives that fail.

Good point - and no doubt why nothing supports extra disks' worth of parity on RAID 5, which would be entirely useless: it would still only protect against a single-disk failure while wasting more space. Except, apparently, the earlier poster's RAID 5 controller, which DOES claim to support extra parity disks. They must just be hot spares; nothing else makes any sense.

> even this isn't completely error proof. I just went through a scare with
> a 15 disk array where it reported 3 dead drives after a power outage.
> one of the dead drives ended up being the hot-spare, and another drive
> that acted up worked well enough to let me eventually recover all the
> data (seek errors), but it was a very scary week while I worked through
> this.

Since file systems can be corrupted, files deleted, and so on, I try to make sure all my data is backed up well enough that a week's worth of recovery effort is never needed. Dead array? Rebuild and restore from backups. Admittedly this practice arose from a couple of scares much like the one you describe, but at least it happens now. I even test the backups ;-)

Big 7200rpm SATA disks are so cheap compared to high-performance SAS or even 10kRPM SATA disks that it seems like a really bad idea not to have a disk-based backup server with everything backed up close to hand.

For that reason I'm absolutely loving PostgreSQL's WAL archiving (archive_command) and its support for a warm standby server. I can copy the WAL files to another machine and immediately replay them there (which provides a certain level of inherent testing) as well as writing them to tape. It's absolutely wonderful. Sure, the warm standby will run like a man in knee-deep mud while it catches up, but it'll do in an emergency.

The existing "database" used by the app I'm working to replace is an ISAM-based, single-host, shared-file DB in which all the user processes access the DB files directly.
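For anyone curious, a minimal sketch of the WAL-shipping setup I mean (parameter names are the real PostgreSQL ones; the hostname, paths, and use of rsync are just illustrative assumptions):

```conf
# postgresql.conf on the primary -- ship each completed WAL segment
# to the standby as it is archived (%p = full path, %f = file name)
archive_mode = on
archive_command = 'rsync %p standby:/var/lib/pgsql/wal_archive/%f'

# recovery.conf on the warm standby -- replay archived segments
# as they arrive, keeping the standby in continuous recovery
restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'
```

The same archive_command can also feed a tape-backup staging area, which is how the segments end up both replayed and on tape.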
Concurrency is supported only through locking; there is no transaction support, no referential integrity checking, no data typing, and no SQL of any sort, AND it's prone to corruption and data loss if a user process is killed. User processes are killed whenever a user's terminal is closed or loses its connection. Backups are only possible once a day, when all users are logged off, and it's not an application where losing half a day of data is fun. On top of all that it runs on SCO OpenServer 5.0.5 (which has, among other things, the most broken C toolchain I've ever seen).

So ... hooray for up-to-date, well-tested backups and how easy PostgreSQL makes them.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance