Hi all,

I have a question about synchronous streaming replication.

I have two PostgreSQL 9.1 servers set up with streaming replication. On the master node the slave is configured as a synchronous standby. I've verified that pg_stat_replication shows sync_state = sync for the slave node.

It all seems to work fine. But I have noticed that sometimes, when I restore backups created by pg_dump, the slave node disconnects with these messages in the PostgreSQL log:

2013-06-03 13:13:48 GMT 4271 FATAL: could not receive data from WAL stream: SSL connection has been closed unexpectedly
2013-06-03 13:13:53 GMT 4270 LOG: invalid magic number 0000 in log file 15, segment 65, offset 11665408
2013-06-03 13:13:54 GMT 36428 LOG: streaming replication successfully connected to primary
2013-06-03 13:13:54 GMT 36428 FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000000F00000041 has already been removed
2013-06-03 13:13:58 GMT 36458 LOG: streaming replication successfully connected to primary
2013-06-03 13:13:58 GMT 36458 FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000000F00000041 has already been removed

On the master I get this in the log file in the same timespan:

2013-06-03 13:13:47 GMT 1471 LOG: checkpoints are occurring too frequently (2 seconds apart)
2013-06-03 13:13:47 GMT 1471 HINT: Consider increasing the configuration parameter "checkpoint_segments".
2013-06-03 13:13:48 GMT 6189 [unknown] FATAL: requested WAL segment 000000010000000F00000041 has already been removed
2013-06-03 13:13:48 GMT 6189 [unknown] LOG: disconnection: session time: 77:37:37.684 user=root database= host=10.216.80.38 port=56114
2013-06-03 13:13:49 GMT 1471 LOG: checkpoints are occurring too frequently (2 seconds apart)
2013-06-03 13:13:49 GMT 1471 HINT: Consider increasing the configuration parameter "checkpoint_segments".
2013-06-03 13:13:51 GMT 1471 LOG: checkpoints are occurring too frequently (2 seconds apart)
2013-06-03 13:13:51 GMT 1471 HINT: Consider increasing the configuration parameter "checkpoint_segments".
2013-06-03 13:13:51 GMT 1468 LOG: received SIGHUP, reloading configuration files
2013-06-03 13:13:51 GMT 1468 LOG: parameter "synchronous_standby_names" removed from configuration file, reset to default
2013-06-03 13:13:53 GMT 1471 LOG: checkpoints are occurring too frequently (2 seconds apart)
2013-06-03 13:13:53 GMT 1471 HINT: Consider increasing the configuration parameter "checkpoint_segments".
2013-06-03 13:13:53 GMT 44063 [unknown] LOG: connection received: host=10.216.80.38 port=34038
2013-06-03 13:13:54 GMT 44063 [unknown] LOG: replication connection authorized: user=root
2013-06-03 13:13:54 GMT 44063 [unknown] FATAL: requested WAL segment 000000010000000F00000041 has already been removed
2013-06-03 13:13:54 GMT 44063 [unknown] LOG: disconnection: session time: 0:00:00.090 user=root database= host=10.216.80.38 port=34038

What I don't understand is how the slave node can miss a WAL segment when it is supposed to be synchronous. Shouldn't synchronous replication prevent the master from continuing if the slave is not able to receive WAL segments fast enough?

I have only noticed it while restoring a database, but the general load on the DB has otherwise not been that high, so I'm not sure whether it can occur with other workloads.

Best regards,
Mads
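
P.S. In case it helps when reading the logs: as far as I understand, a WAL file name on 9.1 is 24 hex digits (8 for the timeline, 8 for the log file, 8 for the segment). If that is right, 000000010000000F00000041 would be the same "log file 15, segment 65" in which the slave reports the invalid magic number. Below is a small Python sketch of that decoding; the function name is my own and purely illustrative, not anything shipped with PostgreSQL.

```python
def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, log file, segment).

    Assumes the 8 + 8 + 8 hex-digit layout described above.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % (name,))
    timeline = int(name[0:8], 16)   # first 8 digits: timeline ID
    log_id = int(name[8:16], 16)    # middle 8 digits: log file number
    seg_id = int(name[16:24], 16)   # last 8 digits: segment within the log file
    return timeline, log_id, seg_id


if __name__ == "__main__":
    # Segment name taken from the FATAL messages above.
    print(parse_wal_segment_name("000000010000000F00000041"))  # (1, 15, 65)
```

So the segment the master says it has already removed appears to be the very one the slave was in the middle of receiving.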