On Fri, Jul 1, 2011 at 2:18 PM, Mikko Partio <mpartio@xxxxxxxxx> wrote: > Hello list > I have two a machine cluster with PostgreSQL 9.0.4 and streaming > replication. In a normal situation I did a failover -- touched the trigger > file in standby to promote it to production mode. I have done this > previously without any complications but now the to-be-production-database > had a PANIC and shut itself down. From the logs: > postgres[10751]: [2-1] 2011-06-30 17:25:24 EEST [10751]: [1-1] user=,db= > LOG: streaming replication successfully connected to primary > postgres[10736]: [10-1] 2011-07-01 07:50:29 EEST [10736]: [10-1] user=,db= > LOG: trigger file found: /postgresql/data/finish_replication.trigger > postgres[10751]: [3-1] 2011-07-01 07:50:29 EEST [10751]: [2-1] user=,db= > FATAL: terminating walreceiver process due to administrator command > postgres[10736]: [11-1] 2011-07-01 07:50:30 EEST [10736]: [11-1] user=,db= > LOG: redo done at AE/8B1855C8 > postgres[10736]: [12-1] 2011-07-01 07:50:30 EEST [10736]: [12-1] user=,db= > LOG: last completed transaction was at log time 2011-07-01 > 07:50:14.67424+03 > postgres[10736]: [13-1] 2011-07-01 07:50:30 EEST [10736]: [13-1] user=,db= > LOG: restored log file "00000006000000AE0000008B" from archive > postgres[10736]: [14-1] 2011-07-01 07:50:30 EEST [10736]: [14-1] user=,db= > LOG: selected new timeline ID: 7 > postgres[10736]: [15-1] 2011-07-01 07:50:30 EEST [10736]: [15-1] user=,db= > LOG: restored log file "00000006.history" from archive > postgres[10736]: [16-1] 2011-07-01 07:50:30 EEST [10736]: [16-1] user=,db= > LOG: archive recovery complete > postgres[10736]: [17-1] 2011-07-01 07:50:30 EEST [10736]: [17-1] user=,db= > WARNING: page 68975 of relation base/3711580/4398155 was uninitialized > postgres[10736]: [18-1] 2011-07-01 07:50:30 EEST [10736]: [18-1] user=,db= > WARNING: page 782 of relation base/3711580/4395394 was uninitialized > postgres[10736]: [19-1] 2011-07-01 07:50:30 EEST [10736]: [19-1] user=,db= > WARNING: page 68948 of relation base/3711580/4398155 was uninitialized > postgres[10736]: [20-1] 2011-07-01 07:50:30 EEST [10736]: [20-1] user=,db= > WARNING: page 68986 of relation base/3711580/4398155 was uninitialized > [ there are maybe 50 lines of these WARNING messages for all together six > different relations, all indexes ] > postgres[10736]: [73-1] 2011-07-01 07:50:31 EEST [10736]: [73-1] user=,db= > PANIC: WAL contains references to invalid pages > postgres[10727]: [2-1] 2011-07-01 07:50:31 EEST [10727]: [2-1] user=,db= > LOG: startup process (PID 10736) was terminated by signal 6: Aborted > postgres[10727]: [3-1] 2011-07-01 07:50:31 EEST [10727]: [3-1] user=,db= > LOG: terminating any other active server processes > Then I started postgres as it was shut down: > postgres[12191]: [1-1] 2011-07-01 07:50:59 EEST [12191]: [1-1] user=,db= > LOG: database system was interrupted while in recovery at log time > 2011-07-01 07:44:02 EEST > postgres[12191]: [1-2] 2011-07-01 07:50:59 EEST [12191]: [2-1] user=,db= > HINT: If this has occurred more than once some data might be corrupted and > you might need to choose an earlier recovery target. > postgres[12191]: [2-1] 2011-07-01 07:50:59 EEST [12191]: [3-1] user=,db= > LOG: database system was not properly shut down; automatic recovery in > progress > postgres[12191]: [3-1] 2011-07-01 07:50:59 EEST [12191]: [4-1] user=,db= > LOG: redo starts at AE/8B13E398 > postgres[12191]: [4-1] 2011-07-01 07:50:59 EEST [12191]: [5-1] user=,db= > LOG: consistent recovery state reached at AE/8C000000 > postgres[12191]: [5-1] 2011-07-01 07:50:59 EEST [12191]: [6-1] user=,db= > LOG: unexpected pageaddr AE/6E000000 in log file 174, segment 140, offset 0 > postgres[12191]: [6-1] 2011-07-01 07:50:59 EEST [12191]: [7-1] user=,db= > LOG: redo done at AE/8B1855C8 > postgres[12191]: [7-1] 2011-07-01 07:50:59 EEST [12191]: [8-1] user=,db= > LOG: last completed transaction was at log time 2011-07-01 > 07:50:14.67424+03 > postgres[12194]: [1-1] 2011-07-01 07:50:59 EEST [12194]: [1-1] user=,db= > LOG: autovacuum launcher started > postgres[12189]: [1-1] 2011-07-01 07:50:59 EEST [12189]: [1-1] user=,db= > LOG: database system is ready to accept connections > Are these log messages something to worry about? http://archives.postgresql.org/pgsql-hackers/2008-12/msg01335.php According to the above thread, this issue seems to exist for a few years, and be unsolved yet. Could you provide a self-contained test case? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin