Steps taken that work:
1. Take a low-level (tar) backup on the live server
2. Restore the files on a second server (identical OS and PG versions)
3. Copy the archived WAL files to the backup server
4. Restore the archived WAL files on the backup server
However, steps 5 through 8 do not work, and I don't understand why:
5. Continue to archive WAL files on the live server
6. Shut down PG on the backup server
7. Copy the additional archived WAL files to the backup server
8. Restore the additional WAL files on the backup server
At this point I get the following errors:
2007-05-04 15:36:37.623 CDT LOG: restored log file "000000010000008400000004" from archive
2007-05-04 15:36:37.623 CDT LOG: invalid record length at 84/415CF9C
2007-05-04 15:36:37.623 CDT LOG: invalid primary checkpoint record
2007-05-04 15:36:37.781 CDT LOG: restored log file "000000010000008400000004" from archive
2007-05-04 15:36:37.782 CDT LOG: invalid record length at 84/415CF54
2007-05-04 15:36:37.782 CDT LOG: invalid secondary checkpoint record
2007-05-04 15:36:37.782 CDT PANIC: could not locate a valid checkpoint record
2007-05-04 15:36:37.782 CDT LOG: startup process (PID 22770) was terminated by signal 6
2007-05-04 15:36:37.782 CDT LOG: aborting startup due to startup process failure
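As a sanity check on the log above, here is a minimal sketch of how the locations in the "invalid record length" messages appear to map onto the segment file being restored. It assumes the default 16 MB WAL segment size (not stated in this post), and the helper name wal_segment_name is hypothetical, not part of PostgreSQL.

```python
# Assumption: default 16 MB WAL segments (the post does not state the size).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline: int, location: str) -> str:
    """Return the WAL segment file name that holds a log location.

    `location` is written as in the server log, e.g. "84/415CF9C":
    the hex log id, a slash, then the hex byte offset within that log.
    This helper is illustrative only; it is not a PostgreSQL API.
    """
    log_id_hex, offset_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    # Each segment covers WAL_SEGMENT_SIZE bytes of the log.
    segment = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    # File name: timeline, log id, segment number, each as 8 hex digits.
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


if __name__ == "__main__":
    for loc in ("84/415CF9C", "84/415CF54"):
        print(loc, "->", wal_segment_name(1, loc))
```

Under that assumption, both checkpoint locations fall inside 000000010000008400000004, the same file the log reports restoring, so the failure is in reading the checkpoint records from that segment and not in locating the file.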
There must be something I don't understand about how WAL files work. If the backup system
hasn't been touched since step 4 completed successfully, other than to shut down the PG
server, shouldn't it be possible to keep it in sync with the live server by continuing to
restore WAL files?
--
Mike Nolan
mnolan@xxxxxxxxxxx