Hello,

Today our standby instance stopped working with this error in the log:

2013-06-22 16:27:32 UTC [8367]: [247-1] [] WARNING: page 158130 of relation pg_tblspc/16447/PG_9.2_201204301/16448/39154429 is uninitialized
2013-06-22 16:27:32 UTC [8367]: [248-1] [] CONTEXT: xlog redo vacuum: rel 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129
2013-06-22 16:27:32 UTC [8367]: [249-1] [] PANIC: WAL contains references to invalid pages
2013-06-22 16:27:32 UTC [8367]: [250-1] [] CONTEXT: xlog redo vacuum: rel 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129
2013-06-22 16:27:32 UTC [8366]: [3-1] [] LOG: startup process (PID 8367) was terminated by signal 6: Aborted
2013-06-22 16:27:32 UTC [8366]: [4-1] [] LOG: terminating any other active server processes

After a restart, the exact same error occurred.

We thought that maybe we hit this bug:
http://postgresql.1045698.n5.nabble.com/Completely-broken-replica-after-PANIC-WAL-contains-references-to-invalid-pages-td5750072.html
However, there is nothing in our log about sub-transactions, so it didn't seem the same to us.

Any advice on how to further debug this, so that we can avoid it in the future, is appreciated.

Environment: AWS, High I/O instance (hi1.4xlarge), 60GB RAM

Software and settings:

PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 64-bit

archive_command               rsync -a %p slave:/var/lib/postgresql/replication_load/%f
archive_mode                  on
autovacuum_freeze_max_age     1000000000
autovacuum_max_workers        6
checkpoint_completion_target  0.9
checkpoint_segments           128
checkpoint_timeout            30min
default_text_search_config    pg_catalog.english
hot_standby                   on
lc_messages                   en_US.UTF-8
lc_monetary                   en_US.UTF-8
lc_numeric                    en_US.UTF-8
lc_time                       en_US.UTF-8
listen_addresses              *
log_checkpoints               on
log_destination               stderr
log_line_prefix               %t [%p]: [%l-1] [%h]
log_min_duration_statement    -1
log_min_error_statement       error
log_min_messages              error
log_timezone                  UTC
maintenance_work_mem          1GB
max_connections               1200
max_standby_streaming_delay   90s
max_wal_senders               5
port                          5432
random_page_cost              2
seq_page_cost                 1
shared_buffers                4GB
ssl                           off
ssl_cert_file                 /etc/ssl/certs/ssl-cert-snakeoil.pem
ssl_key_file                  /etc/ssl/private/ssl-cert-snakeoil.key
synchronous_commit            off
TimeZone                      UTC
wal_keep_segments             128
wal_level                     hot_standby
work_mem                      8MB

root@ip-10-148-131-236:~# /usr/local/pgsql/bin/pg_controldata /usr/local/pgsql/data
pg_control version number:            922
Catalog version number:               201204301
Database system identifier:           5838668587531239413
Database cluster state:               in archive recovery
pg_control last modified:             Sat 22 Jun 2013 06:13:07 PM UTC
Latest checkpoint location:           2250/18CA0790
Prior checkpoint location:            2250/18CA0790
Latest checkpoint's REDO location:    224F/E127B078
Latest checkpoint's TimeLineID:       2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          1/2018629527
Latest checkpoint's NextOID:          43086248
Latest checkpoint's NextMultiXactId:  7088726
Latest checkpoint's NextMultiOffset:  20617234
Latest checkpoint's oldestXID:        1690316999
Latest checkpoint's oldestXID's DB:   16448
Latest checkpoint's oldestActiveXID:  2018629527
Time of latest checkpoint:            Sat 22 Jun 2013 03:24:05 PM UTC
Minimum recovery ending location:     2251/5EA631F0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      1200
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
root@ip-10-148-131-236:~#

Thanks again.

Dan