Greetings, * Joao Junior (jcoj2006@xxxxxxxxx) wrote: > I am running 2 Linux machines, kernel 3.13.0-45-generic #74-Ubuntu SMP. > Postgresql version 9.4 in both machine, in a Hot Standby cenario. > > Master-Slave using WAL files, not streaming replication. > > The archive_command from master is: > > archive_command = '/usr/bin/rsync -a -e "ssh" "%p" > slave:/data2/postgres/standby/main/incoming/"%f"' # Admittedly, you're talking about this for only WAL replay, but I wanted to point out that this is not a safe archive_command setting because you are not forcing an fsync() on the remote side to be sure that the WAL is saved out to disk before telling PG that it can remove that WAL segment. What that means is that a crash on the remote system at just the wrong time will create a hole in your WAL, possibly meaning that you wouldn't be able to replay WAL on the remote side past that point and would have to rebuild/re-sync your replica. There's a lot more that needs to be done carefully and correctly to ensure proper backups than this also. In short, don't try to write your own software for doing this. Use one of the existing tools which do the correct things for you. In particular, when it comes to dealing with archive_command being a bottleneck, pgbackrest has an async mode where it will keep open a long-running SSH session to the remote side to avoid the overhead associated with re-exec'ing SSH. The ControlMaster approach helps too, but you still have to exec rsync and ssh and it's ultimately expensive to even do that. Unfortunatly, restore_command is single threaded and is going to be pretty expensive too, though pgbackrest's archive-get tries to be as fast as it can be. pgbackrest also has the ability to perform incremental backups and incremental restores, in parallel, allowing you to periodically bring your replica up to speed very quickly using file transfers instead of using WAL replay. This does mean that you have a period of downtime for your replica, of course, but that might be a worthwhile trade-off. For pure backups, another approach is to use pg_receivexlog and a tool like barman which supports verifying that the WAL for a given backup has reached the remote side. Thanks! Stephen
Attachment:
signature.asc
Description: Digital signature