On Fri, 2007-03-30 at 12:40 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@xxxxxxxxxxxxxxx> writes:
> > I think there is a problem here. If we stop before the end of logs we
> > should be incrementing the timeline id.
>
> There is no good reason here to think that we have stopped before the
> end of logs, and I don't think I want the code bumping the timeline ID
> on every crash restart.

The timeline is a protection against confusing ourselves when we have
two log files both called the same thing. In the OP's case, there were
clearly unapplied log files that end up as duplicates.

I can see the difficulty in knowing whether or not to bump the timeline
id.

At very least we need to document this, since if the manual's advice
were taken

"The archive command should generally be designed to refuse to overwrite
any pre-existing archive file."

then the OP's system would start throwing errors when the first xlog
fills after the recovered system re-enters normal operation.

We should say: "During recovery it is possible, if you're unlucky, that
one of the WAL files has been damaged. If so, recovery will stop at the
point at which the damage has occurred. It is probable that WAL files
higher than the damaged WAL file exist in the archive. If that is the
case, you may need to begin archiving to a different location, or move
the earlier WAL files out of the archive, to allow the newly restored
server to continue archive operations correctly. If you don't, the
server will operate normally but further archiving may not occur
correctly. Take good care of your archived WAL files or better still
take two copies."

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
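
To illustrate the failure mode described above, here is a minimal sketch
in Python of an archive step that refuses to overwrite, as the manual
advises. The function name archive_wal and the way it would be wired
into archive_command are assumptions for illustration only, not anything
shipped with PostgreSQL, and the sketch does not handle partial copies
or fsync.

```python
import shutil
from pathlib import Path


def archive_wal(wal_path: str, archive_dir: str) -> bool:
    """Copy one WAL segment into archive_dir, refusing to overwrite.

    Hypothetical helper: returns True if the segment was archived and
    False if a file of the same name already exists in the archive.
    """
    src = Path(wal_path)
    dest = Path(archive_dir) / src.name
    try:
        # Mode "xb" creates the file exclusively and raises
        # FileExistsError if dest is already present.
        with open(src, "rb") as fin, open(dest, "xb") as fout:
            shutil.copyfileobj(fin, fout)
    except FileExistsError:
        return False
    return True
```

If recovery stopped early and the restored server then fills a segment
whose name is already in the archive, a command built on a check like
this would report failure for that segment, which is why the old files
would need to be moved aside or the archive location changed.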