On 3 March 2016 at 07:54, cchee-ob <carter.chee@xxxxxxxxxxxxxxxx> wrote:
I queried pg_replication_slots after I removed a BDR node and noticed a
slot_name that isn't in bdr.bdr_node_slots. Its active flag is 'f' and it
has been retaining bytes. Should I be concerned, and is there a way to
remove it? I still have one UDR node running
(bdr_16385_6228994276814368133_1_16384). Any suggestions?
svp2=# SELECT
slot_name, database, active,
pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn)
AS retained_bytes
FROM pg_replication_slots
WHERE plugin = 'bdr';
                slot_name                | database | active | retained_bytes
-----------------------------------------+----------+--------+----------------
 bdr_16385_6206441431541275808_1_16385__ | svp2     | f      |   410036551440
 bdr_16385_6228994276814368133_1_16384__ | svp2     | t      |         285760
(2 rows)
svp2=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+-------------------------------------------
pid | 1122
usesysid | 10
usename | postgres
application_name | bdr (6228994276814368133,1,16384,):receive
client_addr | 10.253.0.8
client_hostname |
client_port | 43724
backend_start | 2016-02-25 17:53:21.10519-08
backend_xmin |
state | streaming
sent_location | 184/AE210BC8
write_location | 184/AE210BC8
flush_location | 184/AE20E748
replay_location | 184/AE210BC8
sync_priority | 0
sync_state | async
svp2=# select * from bdr.bdr_node_slots;
node_name | slot_name
-----------+-----------
(0 rows)
svp2=# SELECT * FROM bdr.bdr_nodes;
-[ RECORD 1 ]------+------------------------------------------
node_sysid | 6206439726032130602
node_timeline | 1
node_dboid | 16385
node_status | r
node_name | BDR1
node_local_dsn | host=10.253.228.105 port=5432 dbname=svp2
node_init_from_dsn |
-[ RECORD 2 ]------+------------------------------------------
node_sysid | 6206440469625465777
node_timeline | 1
node_dboid | 16385
node_status | k
node_name | BDR2
node_local_dsn | host=10.253.16.25 port=5432 dbname=svp2
node_init_from_dsn | host=10.253.228.105 port=5432 dbname=svp2
If you know a slot is unused, drop it with pg_drop_replication_slot(). To confirm it is the slot for the parted node, compare the slot name against the parted node's BDR node identity (sysid, timeline, dboid); see the BDR docs for the functions that return node identity.
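A rough sketch of that check-then-drop sequence, using the inactive slot name from your output. It assumes you have already confirmed the slot belongs to the removed node. Reading the second number in the slot name as the peer's sysid is my assumption, so verify it against the BDR docs. That number (6206441431541275808) does not match either node_sysid in the bdr.bdr_nodes rows you posted.

```sql
-- Re-check the slot right before dropping it: it should still be
-- inactive (active = 'f'), and retained_bytes shows how much WAL it pins.
SELECT slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn)
         AS retained_bytes
FROM pg_replication_slots
WHERE slot_name = 'bdr_16385_6206441431541275808_1_16385__';

-- Drop the slot. PostgreSQL refuses to drop a slot that is in use,
-- so this raises an error if a walsender is still connected to it.
SELECT pg_drop_replication_slot('bdr_16385_6206441431541275808_1_16385__');
```

Once the slot is gone, the WAL it was holding back becomes eligible for removal at subsequent checkpoints. Dropping a slot cannot be undone, so a node that still needed it would have to be re-initialised.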
BDR makes a best-effort attempt at dropping slots when parting a node, but there are known race conditions. We really need a two-phase part, where we first agree to part and *then* actually remove the node, but that's not yet implemented.