I must say this is a bit absurd; I didn’t realize that telling someone not to delete Postgres WAL files from underneath Postgres would require me to provide a complete redundant backup solution. pg_receivexlog or the archive_command being single-threaded is not an issue, at least not for me, and I’m generating 1/2 TB of WALs a day. The real problem is that applying the WALs is single-threaded: applying a single day’s worth of WALs takes too long, which is one of the reasons I take multiple backups a day, to reduce the number of WALs required during a PITR. The solution you proposed would not be able to keep up with the rate of backups I issue daily, nor, from the presentation I reviewed, is it capable of taking backups on a replica, at least not yet. I do my backups on replicas at multiple sites, with WAL files also being stored at multiple sites. I also do daily restores in a lower environment, which take less than 5 minutes; obviously I’m making extensive use of snapshots and snapshot replication. If your archive server is crashing then you have other issues, and you should work to remove single points of failure. I’m not sure what filesystem you’re using, but the one I use syncs to disk every 30 seconds. The original poster stated that rsync wasn’t even an option and that they are not even using it. I’m not here trying to push a backup solution or anything else; I was just trying to give some simple advice for the given problem.