It worked! I was even able to set up streaming replication after doing the incremental replication, without needing to stop postgres on the primary server.

Here is the script I came up with:

#!/bin/bash
if [ $# -ne 1 ]; then
    echo "you must specify the hostname of the backup server" >&2
    exit 1
fi
BACKUP_SERVER=$1
PGDATA=/var/lib/pgsql/data
PGXLOG=$PGDATA/pg_xlog
PGEXEC="sudo su -l postgres -s /bin/bash -c"
RSYNC="rsync"
OPTIONS="--archive --checksum --compress --progress"
EXCLUDES="--exclude postmaster.pid --exclude postgresql.conf --exclude pg_hba.conf --exclude server.crt --exclude server.key"
ROLLOVER=32
SSH="ssh -q -o StrictHostKeyChecking=no -o BatchMode=yes $BACKUP_SERVER"
#the [w]al bracket keeps the remote shell and the grep processes from matching themselves
REPLICATION_CHECK="$SSH ps aux | grep postgres | grep '[w]al' | grep receiver"
#On BACKUP_SERVER: stop postgres if it is running
if [ -n "$($SSH "service postgresql status" | grep "pid[[:blank:]]*[0-9][0-9]*")" ]; then
    $SSH "service postgresql stop"
fi
#On PRIMARY
echo "Running VACUUM"
$PGEXEC "psql -c \"VACUUM FULL;\""
echo "VACUUM completed"
for f in $(ls -tr $PGXLOG | head -n ${ROLLOVER}); do
    $RSYNC $OPTIONS $PGXLOG/$f $BACKUP_SERVER:$PGXLOG/
done
$PGEXEC "psql -c \"SELECT pg_start_backup('incremental_backup',true);\""
#trailing slashes copy the contents of the directory rather than nesting it as data/data
$RSYNC $OPTIONS $EXCLUDES --exclude pg_xlog $PGDATA/ $BACKUP_SERVER:$PGDATA/
$PGEXEC "psql -c \"SELECT pg_stop_backup();\""
$RSYNC $OPTIONS $PGXLOG/ $BACKUP_SERVER:$PGXLOG/
#second pass picks up any WAL segments written while the first pass was running
$RSYNC $OPTIONS $PGXLOG/ $BACKUP_SERVER:$PGXLOG/
#On BACKUP_SERVER
$SSH "service postgresql start"
if [ -z "$($SSH "service postgresql status" | grep "pid[[:blank:]]*[0-9][0-9]*")" ]; then
    echo "Failed to start database on backup server"
    echo "Look into the postgres logs for more details"
    echo "exiting..."
    exit 1
fi
#need to improve this delay-check to wait until the backup server has finished recovery and started into streaming mode
sleep 30
if [ -n "$(${REPLICATION_CHECK})" ]; then
    echo "SUCCESS in syncing BACKUP_SERVER with the latest data from Primary"
    #On BACKUP_SERVER
    $SSH "service postgresql stop"
    echo "Stopped the backup server in good state; it will get updated in the next scheduled incremental backup"
else
    echo "FAILED to sync backup server with Primary"
    echo "Leaving the backup server running in the failed state for further debugging"
    exit 1
fi
exit 0;
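
The fixed "sleep 30" in the script could probably be replaced by a polling loop that waits until the standby reports a WAL receiver process or a timeout expires. What follows is only a rough sketch: wait_for and standby_is_streaming are hypothetical helper names, the 300-second timeout is an arbitrary choice, and the check assumes the standby's process list shows a "wal receiver" entry once streaming has started.

```shell
#!/bin/bash
# Hypothetical helper: poll a command until it succeeds or a timeout expires.
# Usage: wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# INTERVAL_SECONDS must be a positive integer, otherwise the timer never advances.
wait_for() {
    local timeout=$1 interval=$2
    shift 2
    local waited=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Assumed check: succeeds once a WAL receiver process is visible on the standby.
# BACKUP_SERVER is the same hostname argument the script above takes.
standby_is_streaming() {
    ssh -q -o BatchMode=yes "$BACKUP_SERVER" \
        "ps aux | grep postgres | grep '[w]al' | grep receiver"
}

# Possible use in place of "sleep 30" (not executed here):
#   wait_for 300 5 standby_is_streaming || echo "standby did not start streaming"
```

With something like this, a standby that finishes recovery quickly is detected within a few seconds, and a slow recovery gets up to the full timeout before the script declares failure.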
I hope this helps others in need.
Thanks and Regards,
Samba
----------------------------------------------------------------------------------------------------------------------------------------------------------------
On Thu, May 3, 2012 at 11:55 PM, Michael Nolan <htfoot@xxxxxxxxx> wrote:
On Thu, May 3, 2012 at 11:49 AM, Samba <saasira@xxxxxxxxx> wrote:

Hi,

Please advise me whether what I'm doing makes sense and is an accepted mechanism for taking backups, or whether there is another procedure I can employ to avoid unnecessarily archiving gigabytes of WAL logs, which may grow to many times the size of the actual data directory.

Thanks and Regards,
Samba
The problem is that rsync isn't copying all the xlog files created during the time the rsync is taking place, which is why it is complaining that there are files missing.
There may be other logical flaws with your process as well.
Something similar to the steps given in "Starting Replication with only a Quick Master Restart" as laid out in the wiki tutorial on binary replication might give you a way to make this work. (You probably won't need the restart of the master, since you're not actually setting up replication, so you won't be changing the postgresql.conf file on your master.)
This uses a two-step process. First you copy all the files EXCEPT the ones in pg_xlog, then you copy the pg_xlog files, so you have a complete set.
See http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
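
As a rough illustration of the two-step copy described above, something along these lines might work. It is only a sketch under assumptions, not the tutorial's exact commands: the data directory path, the backup label, the run and two_step_copy helper names, and the DRY_RUN switch are all illustrative, and psql is assumed to be run as a user allowed to call pg_start_backup.

```shell
#!/bin/bash
# Sketch of the two-step copy: everything except pg_xlog first, then pg_xlog.
# PGDATA defaults to an assumed location; override it to match your install.
PGDATA=${PGDATA:-/var/lib/pgsql/data}

# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# two_step_copy STANDBY_HOST
two_step_copy() {
    local standby=$1
    # Step 1: inside a base backup, copy the data files but skip pg_xlog.
    run psql -c "SELECT pg_start_backup('two_step_copy', true);"
    run rsync --archive --exclude pg_xlog --exclude postmaster.pid "$PGDATA/" "$standby:$PGDATA/"
    run psql -c "SELECT pg_stop_backup();"
    # Step 2: after the backup is closed, copy the WAL segments so the set is complete.
    run rsync --archive "$PGDATA/pg_xlog/" "$standby:$PGDATA/pg_xlog/"
}

# Example (prints the four commands without touching either server):
#   DRY_RUN=1; two_step_copy standby.example.com
```

Copying pg_xlog only after pg_stop_backup returns is the point of the second step: the segments written while the first rsync was running are already on disk by then, so they are not missed.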
--
Mike Nolan