Is 0000000100000C28000000B1 the same size as the other segments?

-lee

2009/2/25 Mark Steben <msteben@xxxxxxxxxxxxxxx>:
> Hi listers,
>
> Here is my problem. I am running PITR restore on a machine remote from my
> production machine.
>
> I'm shipping logs over there, compressed, then uncompressing them and
> copying them to pg_xlog.
>
> Everything works fine until a network outage creates a gap in my logs.
>
> The recovery terminates at log "0000000100000C28000000B1" and brings the
> database up because it can't find "0000000100000C28000000B2".
>
> Log "0000000100000C28000000B3" is copied over but I wish to restart recovery
> at B2.
>
> So I scp B2 over from my primary machine from a folder that I created for
> just such an occasion.
>
> Now I rename recovery.done to recovery.conf (copied here for your
> convenience):
>
> 'sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log'
>
> (and copy.sh:)
>
> REQ_FILE=$1
> DEST=$2
> LF="${REQ_FILE}.lock"
> SUFFIX=${REQ_FILE##*.}
> #######################################################################################
> ## check if file is transaction log or informational file
> ## if transaction log, cat from archlog and uncompress into unzipped folder
> ## if informational simply copy into unzipped folder (it came over uncompressed)
> #######################################################################################
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>   cat "/logs/var/backups/archlog/${REQ_FILE}" | gzip -dc > "/logs/var/backups/unzipped/${REQ_FILE}"
>   if [ "$?" = "0" ] ;
>   then
>     echo 'successful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>   else
>     echo 'unsuccessful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>     echo 'the return code is ' "$?" >> /tmp/restore.mavmail.log
>   fi
> else
>   cp "/logs/var/backups/archlog/${REQ_FILE}" "/logs/var/backups/unzipped/${REQ_FILE}"
> fi
> #######################################################################################
> ## check for size. If not a full size (16777216) trans log, the copy from
> ## cobra is still in progress. Don't copy this file. Stop recovery here.
> #######################################################################################
> SIZE=$(ls -gG1 "/logs/var/backups/unzipped/${REQ_FILE}" | awk '{ print $3}' )
> echo "The size of the log to be restored is " "${SIZE}" >> /tmp/restore.mavmail.log
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>   if [ "${SIZE}" != '16777216' ]; then
>     echo 'partially written log - not restored - finishing recovery' >> /tmp/restore.mavmail.log
>     exit 0
>   fi
> fi
>
> /usr/bin/lockfile "${LF}"
> #######################################################################################
> ## copy either full sized trans log or informational file
> ## into pg_xlog data cluster.
> #######################################################################################
> cp "/logs/var/backups/unzipped/${REQ_FILE}" "${DEST}"
> rm -f "${LF}"
> rm "/logs/var/backups/unzipped/${REQ_FILE}"
>
> (END)
>
> Now when I try to restart, hoping to begin recovery with the B2 log, I get an
> invalid checkpoint error:
>
> : LOG: starting archive recovery
> Feb 25 10:08:10 ar-db3 postgres[32538]: [3-1] @: LOG: restore_command = "sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log"
> Feb 25 10:08:11 ar-db3 postgres[32538]: [4-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:11 ar-db3 postgres[32538]: [5-1] @: LOG: invalid record length at C28/B1FFECA4
> Feb 25 10:08:11 ar-db3 postgres[32538]: [6-1] @: LOG: invalid primary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [7-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:12 ar-db3 postgres[32538]: [8-1] @: LOG: invalid record length at C28/B1FFEC5C
> Feb 25 10:08:12 ar-db3 postgres[32538]: [9-1] @: LOG: invalid secondary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [10-1] @: PANIC: could not locate a valid checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32537]: [1-1] @: LOG: startup process (PID 32538) was terminated by signal 6
> Feb 25 10:08:12 ar-db3 postgres[32537]: [2-1] @: LOG: aborting startup due to startup process failure
>
> I remove the recovery.conf file, successfully start the database and issue a
> checkpoint. I try the restore again and get the same error.
>
> So, is there a way that I can force the recovery to begin at B2, or am I dead
> in the water and need to bring in another full file copy and start from
> scratch?
>
> Thanks for your time.
>
> Mark Steben│Database Administrator│
> @utoRevenue-(R)- "Join the Revenue-tion"
> 95 Ashley Ave. West Springfield, MA., 01089
> 413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)
>
> @utoRevenue is a registered trademark and a division of Dominion Enterprises

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
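
For the size question at the top of this thread, here is a minimal sketch of one way to check it. It is not from either poster. It assumes the segments have already been uncompressed, and it uses the 16777216-byte size that copy.sh itself tests for. The function name check_segment_sizes and its optional second argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch: list WAL segment files whose size differs from the expected
# 16777216 bytes (the full-segment size that copy.sh checks against).
# check_segment_sizes is a hypothetical helper, not part of PostgreSQL.

EXPECTED_SIZE=16777216

check_segment_sizes() {
    dir=$1
    expected=${2:-$EXPECTED_SIZE}   # optional override, e.g. for testing
    bad=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        # Segment names are 24 uppercase hex characters; skip .history,
        # .backup and any other file.
        case $name in
            *[!0-9A-F]*) continue ;;
        esac
        [ "${#name}" -eq 24 ] || continue
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -ne "$expected" ]; then
            echo "$name: $size bytes (expected $expected)"
            bad=1
        fi
    done
    # Returns 0 when every segment has the expected size, 1 otherwise.
    return "$bad"
}
```

With the function defined, something like `check_segment_sizes /logs/var/backups/unzipped` would print one line per odd-sized segment. Running it against the compressed archlog folder would not be meaningful, since gzip output varies in size.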