Re: Invalid Page Headers

"Thomas F. O'Connell" <tfo@xxxxxxxxxxxx> · Tue, 18 Apr 2006 13:19:48 -0500

On Apr 18, 2006, at 12:30 PM, Thomas F. O'Connell wrote:

> So there are currently three separate relations exhibiting invalid
> page errors.
>
> This box is a Debian 3.1 box running a custom Linux 2.6.10 #6 SMP
> kernel. Postgres 8.1.3 was compiled from source. pgpool 3.0.1, also
> built from source, is used by some parts of the application layer.
> The system is running on an ext3 filesystem, WAL is on a 4-disk
> RAID 10 running JFS, and data is on a 12-disk RAID 10 running JFS.
> I'm not seeing any signs of apparent kernel or hardware errors in
> the system and kernel logs.

I take back the lack of errors. megamgr is now reporting 5 (!) failed  
drives on a single channel in the RAID 10 for data. The RAID card is  
a MegaRAID SCSI 320-2X.

I would've expected the RAID to protect postgres from the possibility  
of data corruption, but I guess not.

In any event, we're working on replacing the failed drives. After the
RAID is rebuilt, though, the focus will be on the data. Is my best bet
to restore the corrupted relations from backup, or can I repair them
somehow?

And I'm still concerned about whether postgres will recover if I stop
it at this point. My contingency plan is to leave postgres online,
turn off the application, restore the tables while nothing is
accessing postgres, and then restart the application. Is there a
safer/better course of action available?

--
Thomas F. O'Connell
Database Architecture and Programming
Sitening, LLC

http://www.sitening.com/
3004 B Poston Avenue
Nashville, TN 37203-1314
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)