Search Postgresql Archives

Re: help troubleshooting invalid page header error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Cory, 

We have zero_damaged_pages parameter in PostgreSQL configuration,by default it is set be off.
To recover data from corrupted table,we can turn on this parameter as a  super user and populate new table using dump or copy utility.

Note : The damaged pages we can't recover from table,it will set to 0 and it will skip while fetching data from table.

Please follow below steps, if decided to recover data from corrupted table.

Sample case :

[postgres@instructor ~]$ /usr/local/pgsql/bin/psql 
psql (9.4rc1)
Type "help" for help.

postgres=# select count(*) from test;
ERROR:  invalid page in block 7 of relation base/13003/16384
postgres=# show zero_damaged_pages;
 zero_damaged_pages 
--------------------
 off
(1 row)

postgres=# set zero_damaged_pages=on;
SET
postgres=# show zero_damaged_pages;
 zero_damaged_pages 
--------------------
 on
(1 row)

postgres=# select count(*) from test;
WARNING:  invalid page in block 7 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 8 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 9 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 10 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 11 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 12 of relation base/13003/16384; zeroing out page
WARNING:  invalid page in block 13 of relation base/13003/16384; zeroing out page

count  
--------
 979163
(1 row)


On Tue, Dec 23, 2014 at 8:47 AM, Cory Zue <czue@xxxxxxxxxx> wrote:
Hi all,

Our postgres instance on one of our production machines has recently been returning errors of the form "DatabaseError: invalid page header in block 1 of relation base/16384/76623" from normal queries. I've been reading that these are often linked to hardware errors, but I would like to better understand what else it could be or how to determine that for sure. I've filled out the standard issue reporting template below. Any feedback or troubleshooting instructions would be much appreciated.

---
A description of what you are trying to achieve and what results you expect.:

Intermittent queries are failing with the error "DatabaseError: invalid page header in block 1 of relation base/16384/76623"

PostgreSQL version number you are running: 

PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

How you installed PostgreSQL: 

from standard package installer

Changes made to the settings in the postgresql.conf file:  


             name             |       current_setting       |        source        
------------------------------+-----------------------------+----------------------
 checkpoint_completion_target | 0.9                         | configuration file
 checkpoint_segments          | 32                          | configuration file
 checkpoint_timeout           | 15min                       | configuration file
 DateStyle                    | ISO, MDY                    | configuration file
 default_text_search_config   | pg_catalog.english          | configuration file
 effective_cache_size         | 1GB                         | configuration file
 lc_messages                  | en_US.UTF-8                 | configuration file
 lc_monetary                  | en_US.UTF-8                 | configuration file
 lc_numeric                   | en_US.UTF-8                 | configuration file
 lc_time                      | en_US.UTF-8                 | configuration file
 log_checkpoints              | on                          | configuration file
 log_connections              | off                         | configuration file
 log_destination              | csvlog                      | configuration file
 log_directory                | /opt/data/pgsql/data/pg_log | configuration file
 log_disconnections           | off                         | configuration file
 log_duration                 | on                          | configuration file
 log_filename                 | postgres-%Y-%m-%d_%H%M%S    | configuration file
 log_lock_waits               | on                          | configuration file
 log_min_duration_statement   | 250ms                       | configuration file
 log_rotation_age             | 1d                          | configuration file
 log_rotation_size            | 1GB                         | configuration file
 log_temp_files               | 0                           | configuration file
 log_timezone                 | Asia/Kolkata                | command line
 log_truncate_on_rotation     | on                          | configuration file
 logging_collector            | on                          | configuration file
 maintenance_work_mem         | 768MB                       | configuration file
 max_connections              | 500                         | configuration file
 max_stack_depth              | 2MB                         | environment variable
 port                         | 5432                        | command line
 shared_buffers               | 4GB                         | configuration file
 ssl                          | on                          | configuration file
 TimeZone                     | Asia/Kolkata                | command line
 timezone_abbreviations       | Default                     | command line
 wal_buffers                  | 16MB                        | configuration file
 work_mem                     | 48MB                        | configuration file

It's also probably worth noting that postgres is installed on an encrypted volume which is mounted using ecryptfs.

Operating system and version:

RedHatEnterpriseServer, version 6.6

What program you're using to connect to PostgreSQL:

Python (django)
 
Is there anything relevant or unusual in the PostgreSQL server logs?:

I see lots of instances of this error (and similar). I'm not sure what else I should be looking for.

What you were doing when the error happened / how to cause the error:

I haven't explicitly tried to reproduce it, but it seems to consistently happen with certain queries. However, the system was rebooted shortly before the errors started occuring. The system was rebooted because another database (elasticsearch) was having problems on the same machine and the reboot was to attempt to resolve things.

The EXACT TEXT of the error message you're getting, if there is one:

DatabaseError: invalid page header in block 1 of relation base/16384/76623

(block and relation numbers change)

Unfortunately, I'm not completely familiar with the CPU and disk/RAID configurations used on the server. However it is storing to a (software) encrypted volume as mentioned above.

  • Have you ever set fsync=off in the postgresql config file?
No
  • Have you had any unexpected power loss lately? Replaced a failed RAID disk? Had an operating system crash?
Not recently, though the system did reboot normally as described above.
  • Have you run a file system check? (chkdsk / fsck)
No.
  • Are there any error messages in the system logs? (unix/linux: dmesg/var/log/syslog ;
I haven't seen anything obvious but I wasn't sure what to look for.

thanks,
Cory


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux