On 11/6/13, 11:32 AM, Jeff Janes wrote:
Hi Jeff,

Thanks for the reply. Oops, I copied one of the many revisions of the script, but not the one with the rsync that copies /wal from the primary to the standby. I should have mentioned that WAL archiving is set up and working from the primary to the standby. It saves WAL both locally on the primary and remotely on the standby.

I moved the rsync line that copies WAL from the primary to the standby so that it runs after pg_stop_backup, but I'm still getting the same panic on the standby.

Here's the real, honest version of the script I use to start the hot standby:

_postgresql@nirvana:/var/postgresql $ cat start_hot_standby.sh
#!/bin/sh
backup_label=wykids_`date +%Y-%m-%d`
#remove any existing wal files on the secondary
ssh dukkha.internal "rm -rf /wal/*"
ssh dukkha.internal sudo /usr/local/bin/svc -d /service/postgresql.5432
psql -c "select pg_start_backup('$backup_label');" template1
rsync \
    --copy-links \
    --delete \
    --exclude=backup_label \
    --exclude=postgresql.conf \
    --exclude=recovery.done \
    -e ssh -avz /var/postgresql/data.93.5432/ \
    dukkha.internal:/var/postgresql/data.93.5432/
ssh dukkha.internal "rm -f /var/postgresql/data.93.5432/pg_xlog/*"
ssh dukkha.internal "rm -f /var/postgresql/data.93.5432/pg_xlog/archive_status/*"
ssh dukkha.internal "rm -f /var/postgresql/data.93.5432/pg_log/*"
ssh dukkha.internal "rm -f /var/postgresql/data.93.5432/postmaster.pid"
ssh dukkha.internal "ln -s /var/postgresql/recovery.conf /var/postgresql/data.93.5432/recovery.conf"
psql -c "select pg_stop_backup();" template1
rsync -e ssh -avz /wal/ dukkha.internal:/wal/
ssh dukkha.internal sudo /usr/local/bin/svc -u /service/postgresql.5432

Here are the logs on the standby after running the above:

2013-11-06 11:56:30.792461500 <%> LOG: database system was interrupted; last known up at 2013-11-06 11:52:22 MST
2013-11-06 11:56:30.800685500 <%> LOG: entering standby mode
2013-11-06 11:56:30.800891500 <%> LOG: invalid primary checkpoint record
2013-11-06 11:56:30.800930500 <%> LOG: invalid secondary checkpoint record
2013-11-06 11:56:30.801004500 <%> PANIC: could not locate a valid checkpoint record

Jeff