Hi Craig
I have a similar issue with a rather small number of servers involved.
Node 1:
max_replication_slots = 4
max_wal_senders = 6
wal_level = 'logical'
track_commit_timestamp = on
shared_preload_libraries = 'bdr'
max_worker_processes = 10
log_error_verbosity = verbose
log_min_messages = debug1
log_line_prefix = 'd=%d p=%p a=%a%q '
bdr.default_apply_delay = 2000
bdr.log_conflicts_to_table = on
bdr.connections = 'bdrnode2,bdrnode3'
bdr.bdrnode2_dsn = 'dbname=db user=postgres host=host2 port=5599'
bdr.bdrnode3_dsn = 'dbname=db user=postgres host=host3 port=5599'
Node 2:
max_replication_slots = 4
max_wal_senders = 6
wal_level = 'logical'
track_commit_timestamp = on
shared_preload_libraries = 'bdr'
max_worker_processes = 10
log_error_verbosity = verbose
log_min_messages = debug1
log_line_prefix = 'd=%d p=%p a=%a%q '
bdr.default_apply_delay = 2000
bdr.log_conflicts_to_table = on
bdr.connections = 'bdrnode1,bdrnode3'
bdr.bdrnode1_dsn = 'dbname=db user=postgres host=host1 port=5599'
bdr.bdrnode3_dsn = 'dbname=db user=postgres host=host3 port=5599'
bdr.bdrnode1_init_replica = on
bdr.bdrnode1_replica_local_dsn = 'dbname=db user=postgres host=localhost port=5599'
Node 3:
max_replication_slots = 4
max_wal_senders = 6
wal_level = 'logical'
track_commit_timestamp = on
shared_preload_libraries = 'bdr'
max_worker_processes = 10
log_error_verbosity = verbose
log_min_messages = debug1
log_line_prefix = 'd=%d p=%p a=%a%q '
bdr.default_apply_delay = 2000
bdr.log_conflicts_to_table = on
bdr.connections = 'bdrnode1,bdrnode2'
bdr.bdrnode1_dsn = 'dbname=db user=postgres host=host1 port=5599'
bdr.bdrnode2_dsn = 'dbname=db user=postgres host=host2 port=5599'
bdr.bdrnode1_init_replica = on
bdr.bdrnode1_replica_local_dsn = 'dbname=db user=postgres host=localhost port=5599'
The above setup is an attempt at a "3-remote site simple Multi-Master Plex" configuration.
Node1 and Node2 are replicating, but Node3 is not. The following is from the Node3 logs:
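In case it helps anyone reproduce this, below is a rough Python sketch for sanity-checking the naming in each node's settings. It assumes, based only on the settings shown above, that every name listed in bdr.connections should have a matching bdr.<name>_dsn entry. The function name check_bdr_connections and the NODE2_CONF sample are just for illustration and are not part of BDR.

```python
import re

# Matches simple "key = value" or "key = 'value'" lines from postgresql.conf.
SETTING = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*=\s*'?([^']*)'?\s*$")


def check_bdr_connections(conf_text):
    """Return (missing_dsn, unused_dsn, duplicates) for a conf fragment.

    Assumption: each name in bdr.connections needs a bdr.<name>_dsn setting.
    """
    settings = {}
    for line in conf_text.splitlines():
        m = SETTING.match(line)
        if m:
            settings[m.group(1)] = m.group(2).strip()

    raw = settings.get("bdr.connections", "")
    names = [n.strip() for n in raw.split(",") if n.strip()]

    # Collect <name> from bdr.<name>_dsn, skipping the *_replica_local_dsn keys.
    dsn_names = {
        key[len("bdr."):-len("_dsn")]
        for key in settings
        if key.startswith("bdr.")
        and key.endswith("_dsn")
        and not key.endswith("_replica_local_dsn")
    }

    duplicates = sorted({n for n in names if names.count(n) > 1})
    missing_dsn = sorted(set(names) - dsn_names)
    unused_dsn = sorted(dsn_names - set(names))
    return missing_dsn, unused_dsn, duplicates


# The bdr.* lines from Node 2, copied from the settings above.
NODE2_CONF = """
bdr.connections = 'bdrnode1,bdrnode3'
bdr.bdrnode1_dsn = 'dbname=db user=postgres host=host1 port=5599'
bdr.bdrnode3_dsn = 'dbname=db user=postgres host=host3 port=5599'
bdr.bdrnode1_init_replica = on
bdr.bdrnode1_replica_local_dsn = 'dbname=db user=postgres host=localhost port=5599'
"""

if __name__ == "__main__":
    print(check_bdr_connections(NODE2_CONF))
```

Three empty lists mean the names line up. This only checks naming consistency, not whether the hosts are reachable or the DSNs are valid.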
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pg_decode_startup, bdr_output.c:450
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive ERROR: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive LOCATION: bdr_ensure_node_ready, bdr_output.c:268
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive LOG: 08006: could not receive data from client: Connection reset by peer
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pq_recvbuf, pqcomm.c:869
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 08003: unexpected EOF on client connection
d=db_name p=3173 a=bdr (6106457559585933042,1,16398,): receive LOCATION: SocketBackend, postgres.c:353
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: received replication command: IDENTIFY_SYSTEM
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: received replication command: START_REPLICATION SLOT "bdr_16396_6106458869483394081_1_16396__" LOGICAL 0/0 (pg_version '90400', pg_catversion '201408161', bdr_version '701', min_bdr_version '700', sizeof_int '4', sizeof_long '8', sizeof_datum '8', maxalign '8', float4_byval '1', float8_byval '1', integer_datetimes '1', bigendian '0', db_encoding 'UTF8')
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: bdr.bdr_conflict_handlers OID set to 16991
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive CONTEXT: slot "bdr_16396_6106458869483394081_1_16396__", output plugin "bdr", in the startup callback
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: pg_decode_startup, bdr_output.c:450
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive ERROR: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive CONTEXT: slot "bdr_16396_6106458869483394081_1_16396__", output plugin "bdr", in the startup callback
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: bdr_ensure_node_ready, bdr_output.c:268
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOG: 08006: could not receive data from client: Connection reset by peer
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: pq_recvbuf, pqcomm.c:869
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 08003: unexpected EOF on client connection
d=db_name p=3175 a=bdr (6106458869483394081,1,16396,): receive LOCATION: SocketBackend, postgres.c:353
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: received replication command: IDENTIFY_SYSTEM
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: received replication command: START_REPLICATION SLOT "bdr_16396_6106457559585933042_1_16398__" LOGICAL 0/0 (pg_version '90400', pg_catversion '201408161', bdr_version '701', min_bdr_version '700', sizeof_int '4', sizeof_long '8', sizeof_datum '8', maxalign '8', float4_byval '1', float8_byval '1', integer_datetimes '1', bigendian '0', db_encoding 'UTF8')
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: bdr.bdr_conflict_handlers OID set to 16991
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pg_decode_startup, bdr_output.c:450
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive ERROR: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: bdr_ensure_node_ready, bdr_output.c:268
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOG: 08006: could not receive data from client: Connection reset by peer
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pq_recvbuf, pqcomm.c:869
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 08003: unexpected EOF on client connection
d=db_name p=3189 a=bdr (6106457559585933042,1,16398,): receive LOCATION: SocketBackend, postgres.c:353
d= p=3190 a=DEBUG: 00000: autovacuum: processing database "db_name"
d= p=3190 a=LOCATION: AutoVacWorkerMain, autovacuum.c:1670
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: received replication command: IDENTIFY_SYSTEM
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: received replication command: START_REPLICATION SLOT "bdr_16396_6106458869483394081_1_16396__" LOGICAL 0/0 (pg_version '90400', pg_catversion '201408161', bdr_version '701', min_bdr_version '700', sizeof_int '4', sizeof_long '8', sizeof_datum '8', maxalign '8', float4_byval '1', float8_byval '1', integer_datetimes '1', bigendian '0', db_encoding 'UTF8')
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 00000: bdr.bdr_conflict_handlers OID set to 16991
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive CONTEXT: slot "bdr_16396_6106458869483394081_1_16396__", output plugin "bdr", in the startup callback
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: pg_decode_startup, bdr_output.c:450
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive ERROR: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive CONTEXT: slot "bdr_16396_6106458869483394081_1_16396__", output plugin "bdr", in the startup callback
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: bdr_ensure_node_ready, bdr_output.c:268
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOG: 08006: could not receive data from client: Connection reset by peer
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: pq_recvbuf, pqcomm.c:869
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive DEBUG: 08003: unexpected EOF on client connection
d=db_name p=3191 a=bdr (6106458869483394081,1,16396,): receive LOCATION: SocketBackend, postgres.c:353
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: received replication command: IDENTIFY_SYSTEM
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: received replication command: START_REPLICATION SLOT "bdr_16396_6106457559585933042_1_16398__" LOGICAL 0/0 (pg_version '90400', pg_catversion '201408161', bdr_version '701', min_bdr_version '700', sizeof_int '4', sizeof_long '8', sizeof_datum '8', maxalign '8', float4_byval '1', float8_byval '1', integer_datetimes '1', bigendian '0', db_encoding 'UTF8')
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: exec_replication_command, walsender.c:1292
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 00000: bdr.bdr_conflict_handlers OID set to 16991
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pg_decode_startup, bdr_output.c:450
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive ERROR: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive HINT: Monitor pg_stat_activity and the logs, wait until the node has caught up
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive CONTEXT: slot "bdr_16396_6106457559585933042_1_16398__", output plugin "bdr", in the startup callback
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: bdr_ensure_node_ready, bdr_output.c:268
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOG: 08006: could not receive data from client: Connection reset by peer
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: pq_recvbuf, pqcomm.c:869
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive DEBUG: 08003: unexpected EOF on client connection
d=db_name p=3199 a=bdr (6106457559585933042,1,16398,): receive LOCATION: SocketBackend, postgres.c:353
The persisting error is: 55000: bdr output plugin: slot creation rejected, bdr.bdr_nodes entry for local node (sysid=6107134089288229725, timelineid=1, dboid=16396): status='i', bdr still starting up: applying initial dump of remote node
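To confirm that it really is the same local node and the same status on every retry, a small Python sketch like the one below can tally the rejections from a log file. The regular expression is written against the exact wording of the error above, so it is an assumption that other BDR versions phrase it the same way. The name rejected_nodes is made up for this example.

```python
import re
from collections import Counter

# Pulls the local node identity and status out of the "slot creation rejected" error.
NODE_STATE = re.compile(
    r"sysid=(?P<sysid>\d+), timelineid=(?P<timeline>\d+), "
    r"dboid=(?P<dboid>\d+)\): status='(?P<status>.)'"
)


def rejected_nodes(log_lines):
    """Count rejections per (sysid, timelineid, dboid, status) tuple."""
    counts = Counter()
    for line in log_lines:
        if "slot creation rejected" not in line:
            continue
        m = NODE_STATE.search(line)
        if m:
            key = (m.group("sysid"), m.group("timeline"),
                   m.group("dboid"), m.group("status"))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python rejected_nodes.py < postgresql.log
    for key, n in rejected_nodes(sys.stdin).items():
        print(key, n)
```

If the tally shows a single key whose status never leaves 'i', the node is stuck in the initial dump phase rather than cycling through different failures.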
I hope I have provided enough detail and that I am posting to the right forum.
Regards