Hi all. I have a simple script for planned switchover of PostgreSQL (9.3 and 9.4) master to one of its replicas. This script checks a lot of things before doing it and one of them is that all data from master has been received by replica that is going to be promoted. Right now the check is done like below: On the master: postgres@pgtest03d ~ $ psql -t -A -c 'select pg_current_xlog_location();' 0/33000090 postgres@pgtest03d ~ $ /usr/pgsql-9.3/bin/pg_ctl stop -m fast waiting for server to shut down.... done server stopped postgres@pgtest03d ~ $ /usr/pgsql-9.3/bin/pg_controldata | head pg_control version number: 937 Catalog version number: 201306121 Database system identifier: 6061800518091528182 Database cluster state: shut down pg_control last modified: Mon 05 Jan 2015 06:47:57 PM MSK Latest checkpoint location: 0/34000028 Prior checkpoint location: 0/33000028 Latest checkpoint's REDO location: 0/34000028 Latest checkpoint's REDO WAL file: 0000001B0000000000000034 Latest checkpoint's TimeLineID: 27 postgres@pgtest03d ~ $ On the replica (after shutdown of master): postgres@pgtest03g ~ $ psql -t -A -c "select pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/34000028');" 104 postgres@pgtest03g ~ $ These 104 bytes seems to be the size of shutdown checkpoint record (as I can understand from pg_xlogdump output). postgres@pgtest03g ~/9.3/data/pg_xlog $ /usr/pgsql-9.3/bin/pg_xlogdump -s 0/33000090 -t 27 rmgr: XLOG len (rec/tot): 0/ 32, tx: 0, lsn: 0/33000090, prev 0/33000028, bkp: 0000, desc: xlog switch rmgr: XLOG len (rec/tot): 72/ 104, tx: 0, lsn: 0/34000028, prev 0/33000090, bkp: 0000, desc: checkpoint: redo 0/34000028; tli 27; prev tli 27; fpw true; xid 0/6010; oid 54128; multi 1; offset 0; oldest xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown pg_xlogdump: FATAL: error in WAL record at 0/34000028: record with zero length at 0/34000090 postgres@pgtest03g ~/9.3/data/pg_xlog $ I’m not sure that these 104 bytes will always be 104 bytes to have a strict equality while checking. Could it change in the future? Or is there a better way to understand that streaming replica received all data after master shutdown? The check that pg_xlog_location_diff returns 104 bytes seems a bit strange. Thanks.
|