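You need to make the script wait indefinitely for the next archive file,
and return 1 only when you want to complete the recovery. A nonzero
return tells the server there is no more WAL to replay, so it ends
recovery and comes up live. Here's a stub of what we do: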
#!/usr/bin/ksh
set -e
# Where copy.sh fetches WALs from
ARCHDIR=/data3/archive_log
# Name of log file that copy.sh writes to
copy_log=copy.log
# Name of stop file that copy looks for to signal "go live"
copy_stop=copy.stop
# How many seconds sleep between copy iterations
seconds=15
# Expected size of WAL file
size=16777216
src=$1
dest=$2
base=$(basename "$src")
exec >>$copy_log
exec 2>&1
echo $base
case $base in
*.history)
    echo "\tignored"
    exit 1
    ;;
*.backup)
    size=+0
    ;;
esac
while :; do
    # A stop file means "go live": remove it and report failure
    if [ -f "$copy_stop" ]; then
        echo "\tstop file"
        rm "$copy_stop"
        exit 1
    fi
    if [ -f "$src" ]; then
        # Copy only once the file has reached the expected size
        if [ "$(find "$src" -size "${size}c")" ]; then
            echo "\tfound"
            cp "$src" "$dest"
            exit 0
        else
            echo "\ttoo small"
        fi
    fi
    echo "\tsleeping $seconds"
    sleep $seconds
done
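
For completeness, here is roughly how a script like this could be hooked
up on the standby. This is only a sketch: the install path
/usr/local/bin/copy.sh is made up for the example, and I'm assuming the
script is handed the full path of the archived file as its first
argument, since it uses $1 directly. %f and %p are the standard
placeholders for the WAL file name and the destination path.

```conf
# recovery.conf on the standby (script path is an assumption)
restore_command = '/usr/local/bin/copy.sh /data3/archive_log/%f %p'
```

Note that copy_stop and copy_log are relative names in the stub, so they
resolve against whatever directory the server runs the command in. If
you adapt this, absolute paths for both are probably the safer choice.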
Kenji Morishige wrote:
I've got 2 identical servers configured exactly the same way, except for some
minor differences for the WAL logging directories. I have both machines set up
as an NFS server and client, so that the WAL archive gets written out to the
local filesystem of the backup machine depending on which role the machine is
currently configured for.
I've been able to get the backup server synchronized by using the recovery.conf
file as described in the documentation, but I can't seem to write a generic shell
script that will keep the warm-backup in a continuously synchronizing mode. It
always stops and renames the recovery.conf to recovery.done.
I've tried to write an alternate restore command as follows:
#!/usr/local/bin/bash
if [ -e /export/raid/pgsql/recovery.stop ]; then
    exit 1
fi
if [ -e $1 ]; then
    `/bin/cp $1 $2`
fi
sleep 5
exit 0
The documentation says that it should return 0 only if it is successful. My
understanding is that the recovery script should continuously try to copy the
archived data to the WAL directory so that the WARM-BACKUP server can
synchronize. I'd like to have the WARM-BACKUP always be only a few minutes
behind in synchronization from the PRIMARY without human intervention. I can
write a cronjob to clean out the WAL archive directory accordingly.
I would be extremely grateful for any assistance from anyone with a similar
configuration. I must be confused by how the restore_command is supposed to
work.
Sincerely,
Kenji