I'm looking for advice on how to best switch from warm standby processing to "stand-in" production use in our (rather unusual) environment. THE ENVIRONMENT We have 72 counties spread around the state, each with their own database, which contains the official data for that county's court system. We do PITR backups to the file server in each county, and replicate the files (both base backups and WAL files) back to a single central machine which hosts a "farm" of warm standby PostgreSQL instances -- one for each county, all on the same machine. This serves to validate the integrity of the files brought back from the counties. Base backups are done weekly, and the WAN speed to a county depends primarily on the size of its base backup -- we try to get a speed sufficient to bring back the base backup within 24 hours. We force WAL files to be written at least once per hour, although many counties generate them much more frequently during the work day. It turns out that 72 warm standby instances can keep a machine pretty busy, so we won't want to use that machine for production. We have two machines ready to use for temporary "stand-ins" over the WAN in the event of a catastrophic failure of a county machine. We will get a database going on one of these, in production mode. This would only be done after recovery efforts on the county machine have failed, and that machine has been powered down. We will bring the warm standby machine as current as we can using WAL files, copy it to a stand-in machine, start our application middle tier and "top off" the stand-in database with transactions from the transaction repository our framework maintains (normally current to within a second). Once this stand-in is running, we will want to become the source of WAL files which can be applied to the warm standby on "the farm". THE QUESTION How do we create the PostgreSQL instance on the stand-in box? I see four possibilities: (1) Restore the latest base backup and apply all WAL files available. This is likely to be the slowest option. (2) Kick the warm standby on "the farm" into production mode, shut down the instance, and then copy the instance directory. This should be relatively quick and safe, but has the down side of needing to restart the warm standby from the latest base backup afterwards, if that is even possible. It seems like we might need to make a fresh base backup from the (stopped) instance on the warm standby farm. (3) Stop the warm standby while it is in recovery mode, copy the instance directory, and restart it. On the stand-in box, have the script specified by recovery.conf just "exit 1" to kick it immediately into production mode. As long as the WAL files generated in this situation work with the old warm standby, I don't see a down side, but I'm not sure if that is a safe assumption. (4) Capture information about where the warm standby is at and attempt a PITR-style copy of the instance while the warm stanby is running. This one seems riskier than (3) without significant benefit. So, I'm leaning toward option (3). Does that sound safe and workable? Will the WAL files from the stand-in work with the old warm standby? Did I miss any good alternatives? Thanks, -Kevin ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate