I'm attempting to set up minimal/simple replication with one master and
one standby using the following pair of identical machines connected
through through a 1-Gb switch:
3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC 2012 x86_64
x86_64 x86_64 GNU/Linux
PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
The documentation says "To use streaming replication, set up a
file-based log-shipping standby server as described in Section 25.2...."
however I'm not using any of the archive or restore commands but instead
use pg_basebackup to do the initial copy in a script that at its core
runs pg_basebackup then starts the standby server. So...
Given a sufficiently large wal_keep_segments on the master is this a
reasonable approach?
Is there a disadvantage, other than disk-space required, to having
wal_keep_segments set to a fairly large number, say 256 or 512?
Once replication was running I tried to stress/break it. I started
pgbench with 100 clients and then simultaneously started a restore (12
GB of tables plus associated indexes). It *seems* to work. I get
appropriate results from test queries, and the master and standby
monitoring queries seem reasonable (queries taken at different times -
log locations won't match):
--Standby
select
pg_last_xlog_receive_location(),
pg_last_xlog_replay_location(),
now()-pg_last_xact_replay_timestamp() as log_delay;
pg_last_xlog_receive_location | pg_last_xlog_replay_location |
log_delay
-------------------------------+------------------------------+-----------------
1F2/F4E4F8C0 | 1F2/F4E4F8C0 |
00:00:00.995516
--Master
select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
procpid | 25945
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 192.168.4.215
client_hostname |
client_port | 41335
backend_start | 2012-06-25 15:59:02.833441-07
state | streaming
sent_location | 1F3/659F2000
write_location | 1F3/659D3538
flush_location | 1F3/659D3538
replay_location | 1F3/659C1570
sync_priority | 0
sync_state | async
However I'm seeing troubling messages in the log. While running pgbench
I see the following types of messages on the master every minute or few:
2012-06-25 16:15:51 PDT WARNING: pgstat wait timeout
2012-06-25 16:16:26 PDT LOG: SSL renegotiation failure
2012-06-25 16:16:26 PDT LOG: SSL error: unexpected record
2012-06-25 16:16:26 PDT LOG: could not send data to client: Connection
reset by peer
The standby has the following sorts of messages:
...
2012-06-25 11:12:11 PDT FATAL: could not receive data from WAL stream:
SSL error: sslv3 alert unexpected message
2012-06-25 11:12:11 PDT LOG: record with zero length at 1C5/95D2FE00
2012-06-25 11:12:26 PDT LOG: streaming replication successfully
connected to primary
...
2012-06-25 11:30:59 PDT LOG: unexpected pageaddr 1C7/C9FAE000 in log
file 456, segment 173, offset 16441344
2012-06-25 11:30:59 PDT LOG: streaming replication successfully
connected to primary
...
2012-06-25 11:36:26 PDT FATAL: could not send data to WAL stream: SSL
error: sslv3 alert unexpected message
2012-06-25 11:36:26 PDT LOG: invalid magic number 0000 in log file 457,
segment 173, offset 15851520
...
2012-06-25 11:36:41 PDT LOG: streaming replication successfully
connected to primary
...
Any advice on what this is telling me? I'm not keen on words like
"FATAL" in my logs.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general