> | Do you see any problem in the current approach ?
> | i have seen it working fine till now.
>
> I do, to be honest. The WAL location counter accounts for 4294967295
> positions and while I'm certain that's WAY more than the average number
> of transactions that go into a WAL, quite a number of small ones can
> certainly happen before a WAL is rolled over, and until then, you're
> dealing with the same log file.
>
> If two backups happen in that period of time for whatever reason, you're
> going to have a false positive by looking into ${WAL_ARCHIVE} and
> searching just for the WAL name, so including the location in the search
> of a WAL fragment is certainly necessary. In fact, going purely by
> chance, the probability of hitting the same location in two different
> log files in two subsequent backups is much lower than hitting the same
> WAL twice.

Dear Grega,

Sincere thanks for your time.

The current WAL log is never removed from the WAL archive area; only
files that sort before the current one are rm'ed. I am sorry, I do not
quite follow your concern, but I shall certainly try harder to
understand your point.

Anyway, have a look at the current script, which has the following
improvements:

1. Does some sanity checks on folder existence and permissions.
2. Accepts 3 mandatory args now: PGDATADIR, BACKUP DUMP FOLDER and
   WAL ARCHIVE AREA.
3. Uses readlink -f to probe all the directories to be included in the
   base backup.
4. Attempts to probe psql and rsync in the system and bails out if they
   are not found.

Regarding:

> | 2. Frees disk space by removing unwanted LOG files in WAL_ARCHIVE_DIR
>
> Perhaps moving the old log files into a farther backup directory and
> having them stick around for a period of time before removing them isn't
> a bad idea either, just in case something goes wrong with your latest
> backup. You could go about that using find as well; see the -ctime
> predicate in find(1).

The old log files are not useful without the base backup.
Since rsync is being used to optimise the copying by overwriting the
base backup every time, I don't think preserving the old files makes
sense. Had it been a non-overwriting backup, keeping the files would
have made sense.

---------------- BEGIN -------------------------------------------------
#!/bin/bash
##################################################
# it does the following
# 1. checks existence and permissions of important folders.
# 2. takes base backup to a destination folder by rsync
# 3. removes unwanted archived log files.
##################################################

if [ $# -ne 3 ]
then
    echo "Usage: $0 <DATADIR> <BACKUP DIRECTORY> <WAL ARCHIVE DIRECTORY>"
    exit 1
fi

DATADIR_IN=$1
BACKUPFOLDER=$2
WAL_ARCHIVE=$3

if [ -z "$BACKUPFOLDER" ] || [ ! -d "$BACKUPFOLDER" ] || [ ! -w "$BACKUPFOLDER" ]
then
    echo "Sorry, base backup folder $BACKUPFOLDER does not exist, is not writable or is not specified!"
    exit 1
fi

if [ -z "$WAL_ARCHIVE" ] || [ ! -d "$WAL_ARCHIVE" ] || [ ! -w "$WAL_ARCHIVE" ]
then
    echo "Sorry, WAL archive folder $WAL_ARCHIVE does not exist, is not writable or is not specified!"
    exit 1
fi

if [ -L "$DATADIR_IN" ]
then
    DATADIR=$(readlink -f "$DATADIR_IN")
    echo "Using $DATADIR instead of $DATADIR_IN as $DATADIR_IN is a link"
else
    DATADIR=$DATADIR_IN
fi

# get all tablespaces from $DATADIR/pg_tblspc
DIRS=( $(find "$DATADIR/pg_tblspc" -type l -exec readlink -f {} \;) )
# append DATADIR to it
DIRS=( "${DIRS[@]}" "$DATADIR" )

echo "Script shall backup following folders"
for DIR in "${DIRS[@]}" ; do
    echo "$DIR"
done

# locate psql and rsync; fall back to fixed paths if "which" fails
PSQL_BIN=$(which psql) || PSQL_BIN=/usr/local/pgsql/bin/psql
RSYNC_BIN=$(which rsync) || RSYNC_BIN=/usr/bin/rsync

for PROG in "$PSQL_BIN" "$RSYNC_BIN" ; do
    if [ ! -f "$PROG" ] || [ ! -x "$PROG" ]
    then
        echo "Sorry, $PROG does not exist or is not executable by you"
        echo "Please set env variable PATH to include psql and rsync"
        exit 1
    else
        echo "Using $PROG"
    fi
done

RSYNC_OPTS="--delete-after -a --exclude pg_xlog"
RSYNC="$RSYNC_BIN $RSYNC_OPTS"
PSQL=$PSQL_BIN

today=$(date +%d-%m-%Y-%H-%M-%S)
label=base_backup_${today}

echo "Executing pg_start_backup with label $label in server ... "

# get the checkpoint at which backup starts
# the .backup files seem to bear this string in them.
CP=$($PSQL -q -Upostgres -d template1 -c "SELECT pg_start_backup('$label');" -P tuples_only -P format=unaligned)
RVAL=$?
if [ $RVAL -ne 0 ]
then
    echo "PSQL pg_start_backup failed:$CP"
    exit 1
fi
echo "pg_start_backup executed successfully"

# read the backup_label file in pgdatadir and get the name of start wal file
# below is example content.
#START WAL LOCATION: E/A9145E4 (file 000000010000000E0000000A)
#CHECKPOINT LOCATION: E/A92939C
#START TIME: 2006-04-01 14:36:48 IST
#LABEL: base_backup_01-04-2006-14-36-45

# assuming pg_start_backup immediately puts backup_label in
# pgdatadir on finish.
BACKUP_LABEL=$DATADIR/backup_label

# get the line containing START WAL LOCATION
START_LINE=$(grep -i "START WAL LOCATION" "$BACKUP_LABEL")
# strip something like 'START WAL LOCATION: E/A9145E4 (file ' from the beginning.
START_LINE=${START_LINE/#START*file /}
# strip ')' from the end.
START_LINE=${START_LINE/%)/}

# REF_FILE_NUM is something like 000000010000000A00000068
REF_FILE_NUM=$START_LINE

echo "Content of $BACKUP_LABEL"
echo "------------- begin -----------"
cat "$BACKUP_LABEL"
echo "------------- end -----------"
echo "Read Start Wal as : $REF_FILE_NUM"

echo "RSYNC begins.."

# rsync each of the folders to the backup folder.
for DIR in "${DIRS[@]}" ; do
    echo "Syncing $DIR..."
    echo "Executing:${RSYNC} $DIR ${BACKUPFOLDER}"
    time ${RSYNC} "$DIR" "${BACKUPFOLDER}"
    RVAL=$?
    echo "Sync finished with exit status ${RVAL}"
    # rsync exit status 23 (partial transfer due to error) is tolerated here
    if [[ ${RVAL} -eq 0 || ${RVAL} -eq 23 ]]; then
        echo "Rsync success"
    else
        echo "Rsync failed"
        $PSQL -Upostgres template1 -c "SELECT pg_stop_backup();"
        exit 1
    fi
done

echo "Executing pg_stop_backup in server ... "
$PSQL -Upostgres template1 -c "SELECT pg_stop_backup();"
if [ $? -ne 0 ]
then
    echo "PSQL pg_stop_backup failed"
    exit 1
fi
echo "pg_stop_backup done successfully"

echo "REF_FILE_NUM=$REF_FILE_NUM"

# iterate over the files in the WAL_ARCHIVE folder
for f in "$WAL_ARCHIVE"/* ; do
    [ -e "$f" ] || continue   # nothing to do if the folder is empty
    i=${f##*/}
    # $i is e.g. 000000010000000A0000005D.bz2
    # get first 24 chars in filename
    FILE_NUM=${i:0:24}
    # compare if the number is less than the reference
    # here string comparison is being used.
    if [[ $FILE_NUM < $REF_FILE_NUM ]]
    then
        echo "$FILE_NUM [ $i ] removed"
        rm -f "$WAL_ARCHIVE/$i"
    else
        echo "$FILE_NUM [ $i ] not removed"
    fi
done
------------------ END -----------------------------------------------------
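
The two steps that decide which archived files get deleted (pulling the start WAL file name out of backup_label, then comparing archive names against it as strings) can be tried in isolation without a running server. The following is only a minimal sketch: the function names wal_file_from_label_line and is_older_wal are hypothetical and not part of the script above, and the two archive file names in the loop are made up for illustration. It assumes WAL file names are 24 uppercase hex characters, as in the examples in the script comments.

```bash
#!/bin/bash
# Hypothetical helpers for illustration; they are not part of the script above.

# Extract the WAL file name from a "START WAL LOCATION" line of backup_label,
# e.g. "START WAL LOCATION: E/A9145E4 (file 000000010000000E0000000A)".
wal_file_from_label_line() {
    local line=$1
    line=${line#*\(file }   # drop everything up to and including "(file "
    line=${line%\)*}        # drop the closing ")" and anything after it
    printf '%s\n' "$line"
}

# Return success (exit status 0) if the archived file sorts strictly before
# the reference WAL file. Only the first 24 characters of the archived name
# are compared, so a suffix such as ".bz2" is ignored.
is_older_wal() {
    local archived=${1:0:24}
    local reference=$2
    [[ $archived < $reference ]]
}

# Example line taken from the backup_label sample in the script comments.
ref=$(wal_file_from_label_line "START WAL LOCATION: E/A9145E4 (file 000000010000000E0000000A)")

# Made-up archive file names, used only to exercise the comparison.
for f in 000000010000000E00000009.bz2 000000010000000E0000000A.bz2 ; do
    if is_older_wal "$f" "$ref" ; then
        echo "$f would be removed"
    else
        echo "$f would be kept"
    fi
done
```

Because the comparison is strict, the file named in backup_label itself is never a removal candidate; only names that sort before it are.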