wal receiver stops for 2 hours

Hi all.

PostgreSQL 11.4 on CentOS 7.

I created a test bed on VirtualBox and am testing an HA cluster by injecting random failures in a loop. Sometimes, in the cases where switching over after a failure takes longest, I see strange behaviour: the walreceiver is terminated by a timeout but is not restarted for 2 hours. I can accept the termination by timeout, but why does PostgreSQL wait 2 hours before starting it again?

2019-08-12 16:34:31.118 MSK [1455] FATAL:  terminating walreceiver due to timeout
2019-08-12 16:34:31.119 MSK [1451] LOG:  record with incorrect prev-link DC7A2D84/100 at 0/D078A38
2019-08-12 18:34:50.222 MSK [14634] FATAL:  could not connect to the primary server: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
2019-08-12 18:34:50.234 MSK [8462] LOG:  fetching timeline history file for timeline 4 from primary server
2019-08-12 18:34:50.235 MSK [8462] LOG:  started streaming WAL from primary at 0/D000000 on timeline 3
2019-08-12 18:34:50.237 MSK [8462] LOG:  replication terminated by primary server
2019-08-12 18:34:50.237 MSK [8462] DETAIL:  End of WAL reached on timeline 3 at 0/D078A38.
2019-08-12 18:34:50.238 MSK [1451] LOG:  new target timeline is 4
2019-08-12 18:34:50.239 MSK [8462] LOG:  restarted WAL streaming at 0/D000000 on timeline 4
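
For reference, the exact length of the silent period can be computed from the two log lines that surround it. This is only a minimal sketch: the helper names (parse_log_time, gap_seconds) are hypothetical, and the timestamps are copied by hand from the log excerpt above with the MSK suffix dropped, since both lines are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above (assumption: both are MSK,
# so the zone suffix can be ignored when taking the difference).
LAST_BEFORE_GAP = "2019-08-12 16:34:31.119"  # "record with incorrect prev-link"
FIRST_AFTER_GAP = "2019-08-12 18:34:50.222"  # "could not connect to the primary server"


def parse_log_time(text):
    """Parse a timestamp in the 'YYYY-MM-DD HH:MM:SS.mmm' form used in the log."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def gap_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    return (parse_log_time(end) - parse_log_time(start)).total_seconds()


if __name__ == "__main__":
    gap = gap_seconds(LAST_BEFORE_GAP, FIRST_AFTER_GAP)
    print(f"gap: {gap:.3f} s ({gap / 3600:.2f} h)")
```

The gap comes out at roughly 7219 seconds, i.e. 2 hours and about 19 seconds, so the pause is almost exactly 2 hours rather than some arbitrary interval.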

Maybe the reason is the restart_after_crash = off option (recommended for HA clusters).

The PostgreSQL configuration is the default, except for:

ident_file = '/var/lib/pgsql/pg_ident.conf'
hba_file = '/var/lib/pgsql/pg_hba.conf'
listen_addresses = '*'
log_filename = 'postgresql.%F.log'      # log file name pattern,
wal_keep_segments = 1
restart_after_crash = off

shared_buffers = 256MB
# maybe this is a good compromise
synchronous_commit = remote_write
# the other DC is first, ours is last (this differs on each node)
synchronous_standby_names = 'FIRST 1 (tuchanka2a,tuchanka2c,tuchanka2b)'

And additionally, on the standbys:

primary_conninfo = 'host=krogan2 user=replicant application_name=tuchanka2d sslmode=disable'
recovery_target_timeline = 'latest'
standby_mode = 'on'