Minimal streaming replication

Steve Crawford <scrawford@xxxxxxxxxxxxxxxxxxxx> · Mon, 25 Jun 2012 16:47:10 -0700

I'm attempting to set up minimal/simple replication with one master and 
one standby using the following pair of identical machines connected 
through a 1-Gb switch:
3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

The documentation says "To use streaming replication, set up a 
file-based log-shipping standby server as described in Section 25.2...." 
However, I'm not using any of the archive or restore commands. Instead, 
a script runs pg_basebackup to make the initial copy and then starts 
the standby server. So...

Given a sufficiently large wal_keep_segments on the master, is this a 
reasonable approach?

Is there a disadvantage, other than the disk space required, to having 
wal_keep_segments set to a fairly large number, say 256 or 512?
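
For a sense of scale on the disk-space side, the WAL held back for standbys is roughly wal_keep_segments times the segment size. A minimal sketch of that arithmetic in Python, assuming the default 16 MB segment size (a build with a different segment size would change the figures); the function name is illustrative only:

```python
# WAL retained on the master because of wal_keep_segments.
# Assumption: the default 16 MB WAL segment size.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def wal_keep_bytes(wal_keep_segments):
    """Return the bytes of WAL kept in pg_xlog for the given setting."""
    return wal_keep_segments * WAL_SEGMENT_BYTES


if __name__ == "__main__":
    for segments in (256, 512):
        gib = wal_keep_bytes(segments) / 1024 ** 3
        print(f"wal_keep_segments={segments}: about {gib:.0f} GiB")
```

Under that assumption, 256 segments come to about 4 GiB and 512 to about 8 GiB.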

Once replication was running I tried to stress/break it. I started 
pgbench with 100 clients and then simultaneously started a restore (12 
GB of tables plus associated indexes). It *seems* to work. I get 
appropriate results from test queries, and the master and standby 
monitoring queries seem reasonable (queries taken at different times - 
log locations won't match):

--Standby
select
    pg_last_xlog_receive_location(),
    pg_last_xlog_replay_location(),
    now()-pg_last_xact_replay_timestamp() as log_delay;
 pg_last_xlog_receive_location | pg_last_xlog_replay_location |    log_delay
-------------------------------+------------------------------+-----------------
 1F2/F4E4F8C0                  | 1F2/F4E4F8C0                 | 00:00:00.995516

--Master
select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
procpid          | 25945
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 192.168.4.215
client_hostname  |
client_port      | 41335
backend_start    | 2012-06-25 15:59:02.833441-07
state            | streaming
sent_location    | 1F3/659F2000
write_location   | 1F3/659D3538
flush_location   | 1F3/659D3538
replay_location  | 1F3/659C1570
sync_priority    | 0
sync_state       | async
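
The gap between these locations can be turned into a byte count. A minimal sketch in Python, assuming the pre-9.3 WAL layout in which each log id covers 0xFF000000 bytes (the last 16 MB segment of every 4 GB log file is skipped); the helper names are hypothetical:

```python
# Assumption: PostgreSQL releases before 9.3 skip the last 16 MB segment of
# each logical log file, so one log id spans 0xFF000000 bytes.
BYTES_PER_LOGID = 0xFF000000


def xlog_location_to_bytes(location):
    """Convert a location such as '1F3/659F2000' to an absolute byte position."""
    logid_hex, offset_hex = location.split("/")
    return int(logid_hex, 16) * BYTES_PER_LOGID + int(offset_hex, 16)


def lag_bytes(ahead, behind):
    """Bytes by which location `behind` trails location `ahead`."""
    return xlog_location_to_bytes(ahead) - xlog_location_to_bytes(behind)


if __name__ == "__main__":
    sent = "1F3/659F2000"
    print("write lag :", lag_bytes(sent, "1F3/659D3538"))
    print("replay lag:", lag_bytes(sent, "1F3/659C1570"))
```

For the record above, this gives 125640 bytes sent but not yet written on the standby, and 199312 bytes sent but not yet replayed.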

However, I'm seeing troubling messages in the log. While running pgbench, 
I see the following types of messages on the master every minute or so:

2012-06-25 16:15:51 PDT WARNING:  pgstat wait timeout
2012-06-25 16:16:26 PDT LOG:  SSL renegotiation failure
2012-06-25 16:16:26 PDT LOG:  SSL error: unexpected record
2012-06-25 16:16:26 PDT LOG:  could not send data to client: Connection reset by peer

The standby has the following sorts of messages:
...
2012-06-25 11:12:11 PDT FATAL:  could not receive data from WAL stream: SSL error: sslv3 alert unexpected message
2012-06-25 11:12:11 PDT LOG:  record with zero length at 1C5/95D2FE00
2012-06-25 11:12:26 PDT LOG:  streaming replication successfully connected to primary
...
2012-06-25 11:30:59 PDT LOG:  unexpected pageaddr 1C7/C9FAE000 in log file 456, segment 173, offset 16441344
2012-06-25 11:30:59 PDT LOG:  streaming replication successfully connected to primary
...
2012-06-25 11:36:26 PDT FATAL:  could not send data to WAL stream: SSL error: sslv3 alert unexpected message
2012-06-25 11:36:26 PDT LOG:  invalid magic number 0000 in log file 457, segment 173, offset 15851520
...
2012-06-25 11:36:41 PDT LOG:  streaming replication successfully connected to primary
...
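
The "log file N, segment M, offset K" positions in these messages are decimal, while the locations elsewhere are in hex "logid/offset" form. A small sketch in Python for converting one to the other, assuming the default 16 MB segment size; the function name is hypothetical:

```python
# Assumption: the default 16 MB WAL segment size.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def log_position_to_location(log_file, segment, offset):
    """Format decimal (log file, segment, offset) as a hex 'logid/offset' location."""
    return "%X/%X" % (log_file, segment * WAL_SEGMENT_BYTES + offset)


if __name__ == "__main__":
    # Positions taken from the two standby messages above.
    print(log_position_to_location(456, 173, 16441344))
    print(log_position_to_location(457, 173, 15851520))
```

With these inputs the positions come out as 1C8/ADFAE000 and 1C9/ADF1E000. The first can be compared directly with the pageaddr 1C7/C9FAE000 reported in the same message.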

Any advice on what this is telling me? I'm not keen on words like 
"FATAL" in my logs.

Cheers,
Steve

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)