Hi Lee, just got your reply.

Every segment comes over compressed (gzip), so each one is a different size in the compressed folder. We decompress each one into another folder (again with gzip), and it always comes out at the standard 16 MB size before we copy it into pg_xlog. So 0000000100000C28000000B1 came into pg_xlog as 16777216 bytes, just like the others.

Thanks for the response.

Mark Steben│Database Administrator│
@utoRevenue-R- "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)

@utoRevenue is a registered trademark and a division of Dominion Enterprises

-----Original Message-----
From: Lee Azzarello [mailto:lee@xxxxxxxxxx]
Sent: Wednesday, February 25, 2009 10:40 AM
To: pgsql-admin@xxxxxxxxxxxxxx
Subject: Re: recovery question

Is 0000000100000C28000000B1 the same size as the other segments?

-lee

2009/2/25 Mark Steben <msteben@xxxxxxxxxxxxxxx>:
> Hi listers,
>
> Here is my problem. I am running a PITR restore on a machine remote from my production machine.
>
> I'm shipping logs over there, compressed, then uncompressing them and copying them to pg_xlog.
>
> Everything works fine until a network outage creates a gap in my logs.
>
> The recovery terminates at log "0000000100000C28000000B1" and brings the database up because it can't find "0000000100000C28000000B2".
>
> Log "0000000100000C28000000B3" is copied over but I wish to restart recovery at B2.
>
> So I scp B2 over from my primary machine from a folder that I created for just such an occasion.
>
> Now I rename recovery.done to recovery.conf (Copied here for your convenience)
>
> 'sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log'
>
> (and copy.sh:)
>
> REQ_FILE=$1
> DEST=$2
> LF="${REQ_FILE}.lock"
> SUFFIX=${REQ_FILE##*.}
> ###############################################################
> ## check if file is transaction log or informational file
> ## if transaction log, cat from archlog and uncompress into unzipped folder
> ## if informational simply copy into unzipped folder (it came over uncompressed)
> #####################################################################################
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>     cat "/logs/var/backups/archlog/${REQ_FILE}" | gzip -dc > "/logs/var/backups/unzipped/${REQ_FILE}"
>     if [ "$?" = "0" ] ;
>     then
>         echo 'successful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>     else
>         echo 'unsuccessful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>         echo 'the return code is ' "$?" >> /tmp/restore.mavmail.log
>     fi
> else
>     cp "/logs/var/backups/archlog/${REQ_FILE}" "/logs/var/backups/unzipped/${REQ_FILE}"
> fi
> #######################################################################################
> ## check for size. If not a full size (16777216) trans log, the copy from
> ## cobra is still in progress. Don't copy this file. Stop recovery here.
> #######################################################################################
> SIZE=$(ls -gG1 "/logs/var/backups/unzipped/${REQ_FILE}" | awk '{ print $3}' )
> echo "The size of the log to be restored is " "${SIZE}" >> /tmp/restore.mavmail.log
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>     if [ "${SIZE}" != '16777216' ]; then
>         echo 'partially written log - not restored - finishing recovery' >> /tmp/restore.mavmail.log
>         exit 0
>     fi
> fi
>
> /usr/bin/lockfile "${LF}"
> ################################################################
> ## copy either full sized trans log or informational file
> ## into pg_xlog data cluster.
> ################################################################
> cp "/logs/var/backups/unzipped/${REQ_FILE}" "${DEST}"
> rm -f "${LF}"
> rm "/logs/var/backups/unzipped/${REQ_FILE}"
>
> (END)
>
> Now when I try to restart, hoping to begin recovery with the B2 log, I get an invalid checkpoint error:
>
> : LOG: starting archive recovery
> Feb 25 10:08:10 ar-db3 postgres[32538]: [3-1] @: LOG: restore_command = "sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log"
> Feb 25 10:08:11 ar-db3 postgres[32538]: [4-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:11 ar-db3 postgres[32538]: [5-1] @: LOG: invalid record length at C28/B1FFECA4
> Feb 25 10:08:11 ar-db3 postgres[32538]: [6-1] @: LOG: invalid primary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [7-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:12 ar-db3 postgres[32538]: [8-1] @: LOG: invalid record length at C28/B1FFEC5C
> Feb 25 10:08:12 ar-db3 postgres[32538]: [9-1] @: LOG: invalid secondary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [10-1] @: PANIC: could not locate a valid checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32537]: [1-1] @: LOG: startup process (PID 32538) was terminated by signal 6
> Feb 25 10:08:12 ar-db3 postgres[32537]: [2-1] @: LOG: aborting startup due to startup process failure
>
> I remove the recovery.conf file, successfully start the database and issue a checkpoint. I try the restore again and get the same error.
>
> So, is there a way that I can force the recovery to begin at B2, or am I dead in the water and need to bring in another full file copy and start from scratch?
>
> Thanks for your time.
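
Two details of copy.sh as quoted above stand out on a read-through. In the failure branch, the `$?` that gets logged is the exit status of the preceding `echo`, not of `gzip`. And when a segment is short, the script exits with status 0, while the PostgreSQL documentation for restore_command says the command must return a nonzero exit status on failure, including when it is asked for a file that is not in the archive. The following is a minimal sketch of those two points only. It is not the original script and is not offered as a fix for the checkpoint error. The function name `restore_segment`, its argument order, the optional size argument and the `.tmp.$$` temporary file name are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not the original copy.sh): restore one archived file.
# Usage: restore_segment ARCHIVE_DIR REQ_FILE DEST [SEGMENT_SIZE]
restore_segment() {
    archive_dir=$1
    req_file=$2
    dest=$3
    seg_size=${4:-16777216}          # default: one full 16777216-byte segment
    src="${archive_dir}/${req_file}"
    tmp="${dest}.tmp.$$"             # assumed temporary name next to DEST

    # A file that is not in the archive must produce a nonzero status.
    [ -f "${src}" ] || return 1

    case "${req_file}" in
        *.history|*.backup)
            # Informational files arrive uncompressed; copy them as-is.
            cp "${src}" "${tmp}" || { rm -f "${tmp}"; return 1; }
            ;;
        *)
            # Capture gzip's own status before any other command overwrites $?.
            gzip -dc "${src}" > "${tmp}" || { rc=$?; rm -f "${tmp}"; return "${rc}"; }

            # A short segment means the transfer is incomplete: report failure.
            size=$(wc -c < "${tmp}" | tr -d ' ')
            if [ "${size}" -ne "${seg_size}" ]; then
                rm -f "${tmp}"
                return 1
            fi
            ;;
    esac

    # Rename into place so DEST never holds a partially written file.
    mv "${tmp}" "${dest}"
}
```

Under these assumptions, a wrapper script named in restore_command could end with `restore_segment /logs/var/backups/archlog "$1" "$2"`, so that the function's status becomes the script's exit status.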