Search Postgresql Archives

After upgrade to 9.3, streaming replication fails to start

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

I recently installed 9.3 into my work servers.

With 9.2 I had streaming replication working. Primary server is on nirvana, standby server is on dukkha.

Upgrade on nirvana went just fine--no problems there. Initial installation on dukkha went fine as well, but streaming replication fails to start with the error "PANIC: could not locate a valid checkpoint record".

Here's what I'm doing to start streaming replication:

On the standby:

_postgresql@dukkha:/var/postgresql $ initdb -D /var/postgresql/data.93.5432
The files belonging to this database system will be owned by user "_postgresql".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/postgresql/data.93.5432 ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in /var/postgresql/data.93.5432/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... not supported on this platform
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    postgres -D /var/postgresql/data.93.5432
or
    pg_ctl -D /var/postgresql/data.93.5432 -l logfile start

_postgresql@dukkha:/var/postgresql $ cp pg_hba.conf data.93.5432/
_postgresql@dukkha:/var/postgresql $ echo "include '/var/postgresql/my_postgresql.conf'" >> data.93.5432/postgresql.conf
_postgresql@dukkha:/var/postgresql $ tail -1 data.93.5432/postgresql.conf
include '/var/postgresql/my_postgresql.conf'
_postgresql@dukkha:/var/postgresql $ cat my_postgresql.conf
#my settings
listen_addresses = 'localhost,10.31.16.3' # what IP address(es) to listen on; unix_socket_directories = '/var/postgresql/' # (change requires restart)
log_line_prefix = '<%u%%%d> '                   # special values:
log_timezone = 'America/Denver'
timezone = 'America/Denver'
log_statement = 'all'                   # none, ddl, mod, all
archive_mode = on
wal_level = hot_standby                 # minimal, archive, or hot_standby
archive_command = 'cd .'
default_statistics_target = 50 # pgtune wizard 2009-12-19
maintenance_work_mem = 120MB # pgtune wizard 2009-12-19
constraint_exclusion = on # pgtune wizard 2009-12-19
checkpoint_completion_target = 0.9 # pgtune wizard 2009-12-19
effective_cache_size = 1408MB # pgtune wizard 2009-12-19
work_mem = 12MB # pgtune wizard 2009-12-19
wal_buffers = 8MB # pgtune wizard 2009-12-19
checkpoint_segments = 16 # pgtune wizard 2009-12-19
shared_buffers = 480MB # pgtune wizard 2009-12-19
max_connections = 100 # pgtune wizard 2009-12-19
hot_standby = on

_postgresql@dukkha:/var/postgresql $ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=nirvana.internal port=5432 user=wykids_replication'
restore_command = 'cp /wal/%f %p'
archive_cleanup_command = 'pg_archivecleanup /wal %r'

Switch to the primary:

_postgresql@nirvana:/var/postgresql $ cat start_hot_standby.sh
#!/bin/sh
backup_label=wykids_`date +%Y-%m-%d`
#remove any existing wal files on the standby
ssh dukkha.internal rm -rf /wal/*
#stop the standby server if it is running
ssh dukkha.internal sudo /usr/local/bin/svc -d /service/postgresql.5432
psql -c "select pg_start_backup('$backup_label');" template1
rsync \
        --copy-links \
        --delete \
        --exclude=backup_label \
        --exclude=postgresql.conf \
        --exclude=recovery.done \
        -e ssh -avz /var/postgresql/data.93.5432/ \
        dukkha.internal:/var/postgresql/data.93.5432/
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/*
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/archive_status/*
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_log/*
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/postmaster.pid
ssh dukkha.internal ln -s /var/postgresql/recovery.conf /var/postgresql/data.93.5432/recovery.conf
psql -c "select pg_stop_backup();" template1
ssh dukkha.internal sudo /usr/local/bin/svc -u /service/postgresql.5432


_postgresql@nirvana:/var/postgresql $ sh -x start_hot_standby.sh
+ date +%Y-%m-%d
+ backup_label=wykids_2013-11-06
+ ssh dukkha.internal rm -rf /wal/*
+ ssh dukkha.internal sudo /usr/local/bin/svc -d /service/postgresql.5432
+ rsync -e ssh /wal/ dukkha.internal:/wal/
skipping directory .
+ psql -c select pg_start_backup('wykids_2013-11-06'); template1
 pg_start_backup
-----------------
 0/3D000024
(1 row)

+ rsync --copy-links --delete --exclude=backup_label --exclude=postgresql.conf --exclude=recovery.done -e ssh -avz /var/postgresql/data.93.5432/ dukkha.internal:/var/postgresql/data.93.5432/
sending incremental file list
./
(snip rsync output)

sent 239,677,033 bytes  received 225,961 bytes  1,139,681.68 bytes/sec
total size is 1,426,541,526  speedup is 5.95
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/*
rm: /var/postgresql/data.93.5432/pg_xlog/archive_status: is a directory
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/archive_status/*
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_log/*
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/postmaster.pid
+ ssh dukkha.internal ln -s /var/postgresql/recovery.conf /var/postgresql/data.93.5432/recovery.conf
+ psql -c select pg_stop_backup(); template1
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 0/3D0000DC
(1 row)

+ ssh dukkha.internal sudo /usr/local/bin/svc -u /service/postgresql.5432

The standby server fails to start with the following:

2013-11-06 07:50:23.436056500 <%> LOG: database system was interrupted; last known up at 2013-11-06 07:47:07 MST
2013-11-06 07:50:23.495849500 <%> LOG:  entering standby mode
2013-11-06 07:50:23.495854500 <%> LOG:  invalid primary checkpoint record
2013-11-06 07:50:23.495858500 <%> LOG:  invalid secondary checkpoint record
2013-11-06 07:50:23.495862500 <%> PANIC: could not locate a valid checkpoint record

I'm sure I'm missing something simple here, but I'm not seeing what it is.

Thanks!

Jeff Ross


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux