We appear to have had some corruption on a customer’s PostgreSQL cluster.

They are on:
PostgreSQL 9.0.17, 32-bit
Windows Server 2003, Service Pack 2
Intel Xeon 2.66 GHz
4 GB memory

RAID is set up but does not look healthy: it currently has a status of Degraded, and the Segments tab shows Segment 1 (Missing).
I guess we can assume the issue is down to hardware. An engineer has been dispatched to replace the hardware, and we are arranging to have the cluster shut down and backed up to a separate storage device.

Their postgresql.conf file is pretty much as it comes, with only the following line added to the end:

custom_variable_classes = 'user_vars'

Everything was fine until 13:28 on 7th November, when there were a number of these entries in the log:

2014-11-07 13:28:45 GMT WARNING: worker took too long to start; cancelled

After that the log file was cycled and it started with:

2014-11-07 14:15:19 GMT FATAL: the database system is starting up
2014-11-07 14:15:20 GMT FATAL: the database system is starting up
2014-11-07 14:15:20 GMT LOG: database system was interrupted; last known up at 2014-11-07 13:28:42 GMT
2014-11-07 14:15:21 GMT FATAL: the database system is starting up
2014-11-07 14:15:22 GMT FATAL: the database system is starting up
2014-11-07 14:15:23 GMT FATAL: the database system is starting up
2014-11-07 14:15:23 GMT LOG: database system was not properly shut down; automatic recovery in progress
2014-11-07 14:15:23 GMT LOG: record with zero length at 5/7B4CAC0
2014-11-07 14:15:23 GMT LOG: redo is not required
2014-11-07 14:15:24 GMT FATAL: the database system is starting up
2014-11-07 14:15:25 GMT FATAL: the database system is starting up
2014-11-07 14:15:25 GMT LOG: database system is ready to accept connections
2014-11-07 14:15:25 GMT LOG: autovacuum launcher started
2014-11-07 14:15:33 GMT LOG: unexpected EOF on client connection

Since then, whenever we try to write to or query one particular table, we receive the following:

2014-11-07 15:13:57 GMT ERROR: invalid page header in block 29838 of relation base/16392/640564

It’s always the same error (same block and relation) as far as I can tell.

So the question is, what next? We may have lost data that couldn’t be written, but that is not the end of the world. The more important thing is to stop any further data loss.
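For reference, the error line above can be decoded into a file and byte offset under the data directory, which helps when checking whether the damaged block sits on the failing disk segment. The sketch below is only illustrative: it assumes the default 8 kB block size and 1 GB relation segment files (neither is confirmed for this build), and `locate_block` is a hypothetical helper name. In the path base/16392/640564, 16392 is the database OID and 640564 is the relation's filenode.

```python
import re

BLOCK_SIZE = 8192           # assumption: PostgreSQL default block size
SEGMENT_SIZE = 1024 ** 3    # assumption: default 1 GB relation segment files
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE

ERROR_RE = re.compile(
    r"invalid page header in block (\d+) of relation (\S+)"
)


def locate_block(log_line):
    """Return (file path relative to the data directory, byte offset),
    or None if the line is not an 'invalid page header' error."""
    match = ERROR_RE.search(log_line)
    if match is None:
        return None
    block = int(match.group(1))
    relpath = match.group(2)
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    # Segment 0 has no suffix; later segments are named <filenode>.1, .2, ...
    path = relpath if segment == 0 else f"{relpath}.{segment}"
    return path, block_in_segment * BLOCK_SIZE


line = ("2014-11-07 15:13:57 GMT ERROR: invalid page header in block 29838 "
        "of relation base/16392/640564")
print(locate_block(line))  # ('base/16392/640564', 244432896)
```

With those assumptions, block 29838 starts 244432896 bytes into the first segment file of the relation, so only one 8 kB page of that file is reported as damaged.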
Regards,
Russell Keane
INPS