On Wed, 2005-10-26 at 13:38, Wes Williams wrote:
> Even with a primary UPS on the *entire PostgreSQL server*, does one still
> need, or even still recommend, a battery-backed cache on the RAID controller
> card? [ref SCSI 320, of course]
>
> If so, I'd be interested in knowing briefly why.

I'll tell you a quick little story.

We got a new server and aged out the old one. The new server was a dual P-IV 2800 with 2 gigs of RAM and a pair of 36 gig U320 drives in a RAID-1 mirror behind a battery-backed cache. This machine also had four 120 gig IDE drives for file storage, but the database was on the dual SCSIs under the RAID controller. I tested it with the power-off test (pull the plug mid-write and make sure every acknowledged commit survives; a rough sketch of that kind of test is at the end of this message), and it passed with flying colors, so we put it into production. Many other servers, including our Oracle servers, were never tested this way.

This machine had dual redundant power supplies with separate power cables running into two separate rails, each running off of a different UPS. The UPSes were fed by power conditioners, and on the other side of those was a switch to cut us over to diesel generators should the power go out. The UPSes were quite large; even with a hundred or so computers in the hosting center, there was about three hours of battery time before the diesel generator HAD to be up or we'd lose power.

Seems pretty solid, right? We're talking a multi-million-dollar hosting center, the kind with an ops center that looks like the deck of the Enterprise. Raised floors, everything.

Fast forward six months. An electrician working on the wiring in the ceiling above one of the power conditioners clips off a tiny piece of wire. Said tiny piece of wire drops into the power conditioner. Said power conditioner overloads and trips the other two power conditioners in the hosting center. This also blew out the master controller on the UPS setup, so it didn't come back up. The switch for the diesel generator would have cut over, but it was fried too. The UPSes, luckily, were the constant-on variety, so they took the hit for the computers on the other side of them; about half the UPSes were destroyed.

After about three hours, we had enough of the power jury-rigged to bring the systems back up. In a company with dozens and dozens of database servers, running everything from MySQL to Oracle to PostgreSQL to Ingres to MSSQL to Interbase to FoxPro, exactly one of them came up without any errors. You already know which one it was, or I wouldn't be writing this letter.

Power supplies fail, UPSes fail, hard drives fail, and RAID controllers and battery-backed caches fail. You can't remove every possibility of failure, but you can limit the number of things that can harm you should they fail. I do know that after that outage, I never once got shit for using PostgreSQL ever again from anybody.

The sad thing is, if any of those other machines had had battery-backed RAID controllers with local storage (many were running on NFS or SMB mounts), they would have been fine too. But many of the DBAs for those other databases had the same "who needs to worry about sudden power-off when we have UPSes and power conditioners" attitude. You can guess what optional feature suddenly seemed like a good idea for every new database server after that.
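For anyone who wants to run the same kind of power-off test themselves, here is a rough sketch of the idea. The details are my own assumptions, not the exact procedure from back then: it assumes Python 3 with psycopg2, a reachable database named in the DSN, and a throwaway table I'm calling plug_test created ahead of time with `CREATE TABLE plug_test (id bigint PRIMARY KEY)`.

import sys
import psycopg2

# Adjust the DSN for your setup; "plugtest" is a hypothetical database name.
conn = psycopg2.connect("dbname=plugtest")
cur = conn.cursor()

if sys.argv[1:] == ["check"]:
    # After the crash and restart: see how many committed rows survived.
    cur.execute("SELECT max(id) FROM plug_test")
    print("highest surviving id:", cur.fetchone()[0])
else:
    # Before the crash: insert and commit one row at a time, printing each
    # id only AFTER the server has acknowledged the commit. Pull the plug
    # while this loop is running.
    i = 0
    while True:
        i += 1
        cur.execute("INSERT INTO plug_test (id) VALUES (%s)", (i,))
        conn.commit()                     # the row is supposedly durable now
        print("committed:", i, flush=True)

Run the insert loop, yank the power cord mid-run, and note the last id it printed as committed. After the machine comes back up, run the script with "check": if the highest surviving id is lower than the last acknowledged one, a write cache somewhere lied about durability, and that's exactly the failure a battery-backed controller cache is there to prevent.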