Simon Riggs wrote:
then I updated the master with a batch of inserts, but after a while the
slave stopped with these messages:
LOG: restored log file "000000010000000000000021" from archive
LOG: record with zero length at 0/21000048
LOG: invalid primary checkpoint record
LOG: restored log file "000000010000000000000020" from archive
LOG: restored log file "000000010000000000000021" from archive
LOG: invalid resource manager ID in secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 19619) was terminated by signal 6
LOG: aborting startup due to startup process failure
Please run pg_controldata to print out the control file.
Hi, sorry for the long delay.
First of all, I had to stop postgres with pg_ctl stop -m immediate, or it
wouldn't die because of the ongoing replication.
This is the output of pg_controldata:
postgres@www3:/usr/local/postgres_replica/data$ pg_controldata /usr/local/postgres_replica/data/
pg_control version number: 812
Catalog version number: 200510211
Database system identifier: 5001030714849737714
Database cluster state: in recovery
pg_control last modified: Fri 27 Apr 2007 13:20:46 CEST
Current log file ID: 0
Next log file segment: 26
Latest checkpoint location: 0/190C7E04
Prior checkpoint location: 0/190C7DC0
Latest checkpoint's REDO location: 0/190C7E04
Latest checkpoint's UNDO location: 0/0
Latest checkpoint's TimeLineID: 1
Latest checkpoint's NextXID: 3698809
Latest checkpoint's NextOID: 68745
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Time of latest checkpoint: Fri 27 Apr 2007 11:53:47 CEST
Maximum data alignment: 4
Database block size: 8192
Blocks per segment of large relation: 131072
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Date/time type storage: floating-point numbers
Maximum length of locale name: 128
LC_COLLATE: C
LC_CTYPE: C
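
When comparing control file dumps taken at different times, it can help to load the output into a dictionary first. The following is only a minimal sketch: parse_controldata is a hypothetical helper name, and it assumes each line has the "label: value" shape shown above, splitting on the first colon so that timestamps keep their own colons.

```python
def parse_controldata(text):
    """Turn 'label: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as timestamps contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the output above.
sample = """\
Database cluster state: in recovery
pg_control last modified: Fri 27 Apr 2007 13:20:46 CEST
Latest checkpoint location: 0/190C7E04
Prior checkpoint location: 0/190C7DC0
Next log file segment: 26
"""

info = parse_controldata(sample)
print(info["Database cluster state"])
print(info["Latest checkpoint location"])
```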
Backup all the files in case we need to inspect them.
OK.
What was the ending log sequence number (e.g. x/xxxx) from the previous
recovery? I'll see if I can re-create this.
Judging from the logs, I guess it is 0/190C7E04:
LOG: restored log file "000000010000000000000019.000C7E04.backup" from archive
LOG: restored log file "000000010000000000000019" from archive
LOG: checkpoint record is at 0/190C7E04
LOG: redo record is at 0/190C7E04; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 3698809; next OID: 68745
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: automatic recovery in progress
LOG: redo starts at 0/190C7E48
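
The log positions and the archive file names in these messages appear to be related: the checkpoint at 0/190C7E04 sits in segment file 000000010000000000000019, and the backup history file carries the offset 000C7E04. The sketch below reproduces that mapping. It assumes the file name is timeline, log file ID and segment number, each as 8 hex digits, and uses the 16777216-byte segment size reported by pg_controldata. The function names are made up for illustration.

```python
WAL_SEGMENT_SIZE = 16777216  # "Bytes per WAL segment" from pg_controldata


def wal_location(lsn, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Split an 'X/Y' log position into (segment file name, offset in segment).

    Assumes the naming layout seen in the log lines above:
    timeline, log file ID and segment number, each as 8 hex digits.
    """
    log_id_hex, offset_hex = lsn.split("/")
    log_id = int(log_id_hex, 16)
    rec_off = int(offset_hex, 16)
    segment_no, offset_in_segment = divmod(rec_off, segment_size)
    file_name = "%08X%08X%08X" % (timeline, log_id, segment_no)
    return file_name, offset_in_segment


def backup_history_name(lsn, timeline=1):
    """Name of the .backup file for a backup starting at the given position."""
    file_name, offset = wal_location(lsn, timeline)
    return "%s.%08X.backup" % (file_name, offset)


print(wal_location("0/190C7E04"))
print(backup_history_name("0/190C7E04"))
print(wal_location("0/21000048"))
```

Under these assumptions, the zero-length record at 0/21000048 lies 0x48 bytes into segment 000000010000000000000021, which is the file the standby kept restoring before the PANIC.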
What did I do wrong? Is there any other procedure to follow to restart a
stopped replication?
You're right, using the trigger is not the right way to stop/start the
standby. Just stop/start the standby server normally.
As above: a plain stop hangs.
The trigger means that you'd like to perform a failover.
There is a patch not yet applied which will make a new version of
pg_standby. pg_standby's official status right now is beta, so please
expect, look for and report any issues you find. Thanks.
Thank you.