Re: [COMMITTERS] pgsql: Allow a streaming replication standby to follow a timeline switc

hubert depesz lubaczewski <depesz@xxxxxxxxxx> · Tue, 18 Dec 2012 12:42:16 +0100

On Mon, Dec 17, 2012 at 02:01:20PM +0200, Heikki Linnakangas wrote:
> Hmm, is it possible that some WAL was generated in the old master,
> and streamed to the standby, after the new master was already
> promoted? It's important to kill the old master before promoting the
> new master. Otherwise the timelines diverge, so that you have some
> WAL on the old timeline that's not present in the new master, and
> some WAL in the new master's timeline that's not present in the old
> master. In that situation, if the standby has already replicated the
> WAL from the old master, it can no longer start to follow the new
> master. I think that would match the symptoms you're seeing.

OK. Checked, and can't get it work.

Situation:
3 hosts:
ubuntu1 (172.28.173.140)
ubuntu2 (172.28.173.141)
ubuntu3 (172.28.173.142)
on all I have 9.3head compiled.
Also I have floating IP (172.28.173.253) which is aliased on ubuntu1.

On ubuntu1 I make simple Pg initdb, start backup, copy data to ubuntu2
and ubuntu3, and stop backup.

than, on both ubuntu2 and 3, I do:
rm data/pg_xlog/0* data/pg_xlog/archive_status/* data/pg_log/* data/postmaster.pid ; cp ~/recovery.conf ~/data/

recovery.conf is:

standby_mode = 'on'
primary_conninfo = 'port=5920 user=replication host=172.28.173.253'
trigger_file = '/tmp/finish.replication'
recovery_target_timeline='latest'

relevant settings in master conf:
-----------------
=$ grep -i -E '(wal|checkpoint|archive)' data/postgresql.conf 
# RESOURCE USAGE (except WAL)
wal_level = hot_standby
#wal_sync_method = fsync                # the default is the first option
#wal_buffers = -1                       # min 32kB, -1 sets based on shared_buffers
#wal_writer_delay = 200ms               # 1-10000 milliseconds
# - Checkpoints -
#checkpoint_segments = 3                # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min              # range 30s-1h
#checkpoint_completion_target = 0.5     # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s               # 0 disables
archive_mode = on
archive_command = '/bin/true'
                                # placeholders: %p = path of file to archive
                                # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0            # force a logfile segment switch after this
max_wal_senders = 3            # max number of walsender processes
wal_keep_segments = 20          # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s       # in milliseconds; 0 disables
#max_standby_archive_delay = 30s        # max delay before canceling queries
                                        # when reading WAL from archive;
                                        # when reading streaming WAL;
#wal_receiver_status_interval = 10s     # send replies at least this often
#wal_receiver_timeout = 60s             # time that receiver waits for
log_checkpoints = on
-----------------

Afterwards I do:

on ubuntu1:
pg_ctl -m fast stop
sudo ifdown eth0:0

eth0:0 is the interface with shared ip.

all works fine.

then on ubuntu2 I do:
touch /tmp/finish.replication
sudo ifup eth0:0

result:
ubuntu2 is working OK, but ubuntu3 logs:
2012-12-18 12:33:34.037 CET @ 1513  LOG:  database system was interrupted; last known up at 2012-12-18 12:32:47 CET
2012-12-18 12:33:34.037 CET @ 1513  LOG:  entering standby mode
2012-12-18 12:33:49.075 CET @ 1514  LOG:  started streaming WAL from primary at 0/14000000 on timeline 1
2012-12-18 12:33:49.279 CET @ 1513  LOG:  redo starts at 0/14000028
2012-12-18 12:33:49.282 CET @ 1513  LOG:  consistent recovery state reached at 0/140000E8
2012-12-18 12:33:49.282 CET @ 1508  LOG:  database system is ready to accept read only connections
2012-12-18 12:34:14.007 CET @ 1514  LOG:  replication terminated by primary server
2012-12-18 12:34:14.007 CET @ 1514  DETAIL:  End of WAL reached on timeline 1
2012-12-18 12:34:14.014 CET @ 1514  FATAL:  could not send data to WAL stream: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.

2012-12-18 12:34:14.014 CET @ 1513  LOG:  record with zero length at 0/15000088
2012-12-18 12:34:44.066 CET @ 1598  LOG:  fetching timeline history file for timeline 2 from primary server
2012-12-18 12:34:44.071 CET @ 1598  LOG:  started streaming WAL from primary at 0/15000000 on timeline 1
2012-12-18 12:34:44.071 CET @ 1598  LOG:  replication terminated by primary server
2012-12-18 12:34:44.071 CET @ 1598  DETAIL:  End of WAL reached on timeline 1
2012-12-18 12:34:44.071 CET @ 1598  FATAL:  error reading result of streaming command: ERROR:  requested WAL segment 000000010000000000000015 has already been removed

2012-12-18 12:34:44.072 CET @ 1513  LOG:  new target timeline is 2
2012-12-18 12:34:44.078 CET @ 1612  LOG:  started streaming WAL from primary at 0/15000000 on timeline 2
2012-12-18 12:34:44.078 CET @ 1612  LOG:  replication terminated by primary server
2012-12-18 12:34:44.078 CET @ 1612  DETAIL:  End of WAL reached on timeline 2
2012-12-18 12:34:44.078 CET @ 1612  FATAL:  error reading result of streaming command: ERROR:  requested WAL segment 000000020000000000000015 has already been removed
...

and so on.

Looks like it did switch timeline, but the new slave doesn't have some bits in WAL.
Which is weird, because, on ubuntu2:
$ select pg_current_xlog_location();
 pg_current_xlog_location 
──────────────────────────
 0/150000E8
(1 row)

On ubuntu2 in pg_xlog:

12:41:05 pgdba@ubuntu2 ~/data/pg_xlog
=$ ls -la
total 49168
drwx------  3 pgdba pgdba     4096 Dec 18 12:34 ./
drwx------ 15 pgdba pgdba     4096 Dec 18 12:34 ../
-rw-------  1 pgdba pgdba 16777216 Dec 18 12:34 000000020000000000000016
-rw-------  1 pgdba pgdba 16777216 Dec 18 12:34 000000020000000000000017
-rw-------  1 pgdba pgdba 16777216 Dec 18 12:33 000000020000000000000018
-rw-------  1 pgdba pgdba       42 Dec 18 12:34 00000002.history
drwx------  2 pgdba pgdba     4096 Dec 18 12:34 archive_status/

12:41:06 pgdba@ubuntu2 ~/data/pg_xlog
=$ cat 00000002.history 
1       0/15000088      no recovery target specified

In pg_log on ubuntu2 I see:

2012-12-18 12:41:34.428 CET [unknown]@[unknown] 1685  LOG:  connection received: host=172.28.173.142 port=45842
2012-12-18 12:41:34.430 CET replication@[unknown] 1685 172.28.173.142(45842) LOG:  replication connection authorized: user=replication
2012-12-18 12:41:34.432 CET replication@[unknown] 1685 172.28.173.142(45842) ERROR:  requested WAL segment 000000020000000000000015 has already been removed
2012-12-18 12:41:34.433 CET replication@[unknown] 1685 172.28.173.142(45842) LOG:  disconnection: session time: 0:00:00.005 user=replication database= host=172.28.173.142 port=45842

Something looks weird. To put it lightly.

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general