
Re: streaming replication timeout error

On 10/09/2013 05:51 PM, 高健 wrote:
Hello:

Thanks for replying.

The recovery.conf file on the standby (DB2) looks like this:

standby_mode             = 'on'
primary_conninfo         = 'host=DB1 port=5432 application_name=testpg user=postgres connect_timeout=10 keepalives_idle=5 keepalives_interval=1'
recovery_target_timeline = 'latest'
restore_command          = 'scp -o "ConnectTimeout 5" -i /opt/PostgresPlus/9.2AS/.ssh/id_edb DB1:/opt/PostgresPlus/9.2AS/data/arch/%f %p'


I am not familiar with the scp command, but I think scp is used here
to copy archived WAL files from the primary to the standby.

Maybe the ConnectTimeout is too small, and when the network is not
performing well, the restore_command fails and returns a FATAL error?

In fact, I am a little confused about restore_command. We are using
streaming replication, so why is restore_command still needed to copy
archived WAL files? Isn't that the old warm standby (file shipping) method?

Best explanation is in the docs:

http://www.postgresql.org/docs/9.3/static/warm-standby.html
"
At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_xlog. If that fails or streaming replication is not configured, or if the connection is later disconnected, the standby goes back to step 1 and tries to restore the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file.
"

Basically by having a restore_command and primary_conninfo you are telling the standby to do both, following the sequence described above.
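
To make the sequence quoted from the docs concrete, here is a rough Python sketch of the retry loop. It is only a toy model, not PostgreSQL's actual implementation: the function name simulate_standby and the scripted True/False outcomes are made up for illustration. The standby stays on one WAL source while it keeps succeeding, moves to the next source on failure, and wraps back to the archive after streaming fails or disconnects.

```python
from typing import Dict, List, Tuple

# Order in which the standby tries WAL sources, per the quoted docs.
SOURCES = ["archive", "pg_xlog", "streaming"]


def simulate_standby(outcomes: Dict[str, List[bool]],
                     max_steps: int) -> List[Tuple[str, bool]]:
    """Toy model of the standby's WAL source loop (hypothetical helper).

    outcomes[source] is a scripted list of results for that source:
    True means "got WAL from this source", False means "this source failed".
    Once a script is exhausted, further attempts count as failures.
    """
    scripts = {s: iter(outcomes.get(s, [])) for s in SOURCES}
    log: List[Tuple[str, bool]] = []
    idx = 0  # start with the archive (restore_command)
    for _ in range(max_steps):
        source = SOURCES[idx]
        ok = next(scripts[source], False)
        log.append((source, ok))
        if not ok:
            # On failure move to the next source; after streaming,
            # wrap around to the archive again ("goes back to step 1").
            idx = (idx + 1) % len(SOURCES)
    return log


if __name__ == "__main__":
    demo = simulate_standby(
        {"archive": [True, False], "pg_xlog": [False], "streaming": [True, False]},
        max_steps=6,
    )
    for source, ok in demo:
        print(source, "ok" if ok else "failed")
```

In this model a failing restore_command is just the signal to move on to pg_xlog and then to streaming, which is why both settings can sit in the same recovery.conf.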

FYI, ConnectTimeout is an SSH option passed to scp via -o.

man ssh_config will get you more information.
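
If it helps to see how that option travels, the small Python sketch below expands the restore_command from your recovery.conf and splits it roughly the way a POSIX shell would. The WAL file name and target path are made-up example values, expand_restore_command is a hypothetical helper, and %% escaping is ignored for simplicity. Nothing is executed; it only shows that "ConnectTimeout 5" reaches scp as the single argument to -o.

```python
import shlex
from typing import List

# restore_command as given in the recovery.conf from this thread.
RESTORE_COMMAND = (
    'scp -o "ConnectTimeout 5" -i /opt/PostgresPlus/9.2AS/.ssh/id_edb '
    "DB1:/opt/PostgresPlus/9.2AS/data/arch/%f %p"
)


def expand_restore_command(template: str, wal_file: str,
                           target_path: str) -> List[str]:
    """Substitute %f and %p, then tokenize like a POSIX shell (illustrative only)."""
    command = template.replace("%f", wal_file).replace("%p", target_path)
    return shlex.split(command)


if __name__ == "__main__":
    # Example values only; the server supplies the real ones at recovery time.
    argv = expand_restore_command(
        RESTORE_COMMAND,
        wal_file="000000010000000000000003",
        target_path="pg_xlog/RECOVERYXLOG",
    )
    for arg in argv:
        print(repr(arg))
```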

It would seem that both your streaming and your archiving are using the same network. Is that correct?

If so you have a single point of failure, the network.



Best Regards
jian gao



--
Adrian Klaver
adrian.klaver@xxxxxxxxx


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



