BDR not catching up

I'm getting the messages below repeating on the UDR node that I added today.
Is there any way to get it to start applying?
svp2=# select * from bdr.bdr_nodes;
     node_sysid      | node_timeline | node_dboid | node_status |    node_name    |            node_local_dsn             |            node_init_from_dsn
---------------------+---------------+------------+-------------+-----------------+---------------------------------------+-------------------------------------------
 6206439726032130602 |             1 |      16385 | r           | UDR1            |                                       |
 6260914790689848233 |             1 |      16385 | c           | UDR1-subscriber | host=10.253.0.8 port=5432 dbname=svp2 | host=10.253.228.105 port=5432 dbname=svp2
(2 rows)



t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG:  00000: per-db worker for
node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION:  bdr_perdb_worker_main,
bdr_perdb.c:707
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG:  00000: init_replica init
from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:830
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG:  00000: launching catchup
mode apply worker
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:1043
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG:  00000: Registering bdr
apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn
19E/10AC4F0
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1161
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG:  00000: registering background
worker "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION: 
BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG:  00000: starting background
worker process "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION:  do_start_bgworker,
postmaster.c:5412
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE:  00000: version "1.0" of
extension "btree_gist" is already installed
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE:  00000: version "0.9.2.0"
of extension "bdr" is already installed
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE:  42622: identifier "bdr
(6260914790689848233,1,16385,): apply catchup up to 19E/10AC4F0" will be
truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to
19E/10A"
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  truncate_identifier,
scansup.c:195
t=2016-03-11 15:23:51 PST d= h= p=7227 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:51 PST d= h= p=7227 a=INFO:  00000: starting up
replication from 1 at 19D/D204D0C8
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  bdr_apply_main,
bdr_apply.c:2550
t=2016-03-11 15:23:51 PST d= h= p=7227 a=DEBUG:  00000: bdr_apply: BEGIN
origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11
13:49:47.293208-08
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  process_remote_begin,
bdr_apply.c:198
t=2016-03-11 15:23:51 PST d= h= p=7227 a=ERROR:  XX000: tuple natts
mismatch, 26 vs 28
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  read_tuple_parts,
bdr_apply.c:1892
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG:  00000: worker process: bdr:
catchup apply to 19E/10AC4F0 (PID 7227) exited with exit code 1
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG:  00000: unregistering
background worker "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION:  ForgetBackgroundWorker,
bgworker.c:376
t=2016-03-11 15:23:52 PST d= h= p=7226 a=ERROR:  XX000: catchup worker
exited before catching up to target LSN 19E/10AC4F0
t=2016-03-11 15:23:52 PST d= h= p=7226 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1273
t=2016-03-11 15:23:52 PST d= h= p=4718 a=LOG:  00000: worker process: bdr
db: svp2 (PID 7226) exited with exit code 1
t=2016-03-11 15:23:52 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
t=2016-03-11 15:23:54 PST d= h= p=7228 a=DEBUG:  00000: autovacuum:
processing database "bdr_supervisordb"
t=2016-03-11 15:23:54 PST d= h= p=7228 a=LOCATION:  AutoVacWorkerMain,
autovacuum.c:1684
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG:  00000: starting background
worker process "bdr db: svp2"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION:  do_start_bgworker,
postmaster.c:5412
t=2016-03-11 15:23:57 PST d= h= p=7229 a=NOTICE:  00000: version "1.0" of
extension "btree_gist" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7229 a=NOTICE:  00000: version "0.9.2.0"
of extension "bdr" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG:  00000: per-db worker for
node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  bdr_perdb_worker_main,
bdr_perdb.c:707
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG:  00000: init_replica init
from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:830
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG:  00000: launching catchup
mode apply worker
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:1043
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG:  00000: Registering bdr
apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn
19E/10BA488
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1161
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG:  00000: registering background
worker "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: 
BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG:  00000: starting background
worker process "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION:  do_start_bgworker,
postmaster.c:5412
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE:  00000: version "1.0" of
extension "btree_gist" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE:  00000: version "0.9.2.0"
of extension "bdr" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE:  42622: identifier "bdr
(6260914790689848233,1,16385,): apply catchup up to 19E/10BA488" will be
truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to
19E/10B"
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  truncate_identifier,
scansup.c:195
t=2016-03-11 15:23:57 PST d= h= p=7230 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:57 PST d= h= p=7230 a=INFO:  00000: starting up
replication from 1 at 19D/D204D0C8
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  bdr_apply_main,
bdr_apply.c:2550
t=2016-03-11 15:23:57 PST d= h= p=7230 a=DEBUG:  00000: bdr_apply: BEGIN
origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11
13:49:47.293208-08
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  process_remote_begin,
bdr_apply.c:198
t=2016-03-11 15:23:57 PST d= h= p=7230 a=ERROR:  XX000: tuple natts
mismatch, 26 vs 28
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION:  read_tuple_parts,
bdr_apply.c:1892
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG:  00000: worker process: bdr:
catchup apply to 19E/10BA488 (PID 7230) exited with exit code 1
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG:  00000: unregistering
background worker "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION:  ForgetBackgroundWorker,
bgworker.c:376
t=2016-03-11 15:23:58 PST d= h= p=7229 a=ERROR:  XX000: catchup worker
exited before catching up to target LSN 19E/10BA488
t=2016-03-11 15:23:58 PST d= h= p=7229 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1273
t=2016-03-11 15:23:58 PST d= h= p=4718 a=LOG:  00000: worker process: bdr
db: svp2 (PID 7229) exited with exit code 1
t=2016-03-11 15:23:58 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG:  00000: starting background
worker process "bdr db: svp2"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION:  do_start_bgworker,
postmaster.c:5412
t=2016-03-11 15:24:03 PST d= h= p=7231 a=NOTICE:  00000: version "1.0" of
extension "btree_gist" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7231 a=NOTICE:  00000: version "0.9.2.0"
of extension "bdr" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG:  00000: per-db worker for
node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  bdr_perdb_worker_main,
bdr_perdb.c:707
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG:  00000: init_replica init
from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:830
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG:  00000: launching catchup
mode apply worker
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  bdr_init_replica,
bdr_init_replica.c:1043
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG:  00000: Registering bdr
apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn
19E/10E9D58
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1161
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG:  00000: registering background
worker "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: 
BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG:  00000: starting background
worker process "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION:  do_start_bgworker,
postmaster.c:5412
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE:  00000: version "1.0" of
extension "btree_gist" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE:  00000: version "0.9.2.0"
of extension "bdr" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  ExecAlterExtensionStmt,
extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE:  42622: identifier "bdr
(6260914790689848233,1,16385,): apply catchup up to 19E/10E9D58" will be
truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to
19E/10E"
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  truncate_identifier,
scansup.c:195
t=2016-03-11 15:24:03 PST d= h= p=7232 a=DEBUG:  00000: found valid
replication identifier 1
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: 
bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:24:03 PST d= h= p=7232 a=INFO:  00000: starting up
replication from 1 at 19D/D204D0C8
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  bdr_apply_main,
bdr_apply.c:2550
t=2016-03-11 15:24:03 PST d= h= p=7232 a=DEBUG:  00000: bdr_apply: BEGIN
origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11
13:49:47.293208-08
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  process_remote_begin,
bdr_apply.c:198
t=2016-03-11 15:24:03 PST d= h= p=7232 a=ERROR:  XX000: tuple natts
mismatch, 26 vs 28
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION:  read_tuple_parts,
bdr_apply.c:1892
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG:  00000: worker process: bdr:
catchup apply to 19E/10E9D58 (PID 7232) exited with exit code 1
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG:  00000: unregistering
background worker "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION:  ForgetBackgroundWorker,
bgworker.c:376
t=2016-03-11 15:24:04 PST d= h= p=7231 a=ERROR:  XX000: catchup worker
exited before catching up to target LSN 19E/10E9D58
t=2016-03-11 15:24:04 PST d= h= p=7231 a=LOCATION:  bdr_catchup_to_lsn,
bdr_init_replica.c:1273
t=2016-03-11 15:24:04 PST d= h= p=4718 a=LOG:  00000: worker process: bdr
db: svp2 (PID 7231) exited with exit code 1
t=2016-03-11 15:24:04 PST d= h= p=4718 a=LOCATION:  LogChildExit,
postmaster.c:3325
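
Since the same cycle repeats every few seconds, it may help to condense the log to its ERROR entries. The following is a minimal sketch in Python, assuming the line prefix seen above (`t=<timestamp> d=<db> h=<host> p=<pid> a=<level>:`) and that any line not starting with `t=` is a mail-wrapped continuation of the previous entry. The names `join_wrapped` and `summarize_errors` are hypothetical helpers for illustration, not part of BDR or PostgreSQL.

```python
import re
from collections import Counter

# Assumed prefix layout, taken from the log excerpt in this post:
# t=<date> <time> <tz> d=<db> h=<host> p=<pid> a=<LEVEL>: <message>
ENTRY_RE = re.compile(
    r"^t=(?P<ts>\S+ \S+ \S+) d=(?P<db>\S*) h=(?P<host>\S*) p=(?P<pid>\d+) "
    r"a=(?P<level>[A-Z]+):\s*(?P<msg>.*)$"
)


def join_wrapped(lines):
    """Merge wrapped continuation lines into the entry that starts with 't='."""
    entries = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        if line.startswith("t=") or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def summarize_errors(text):
    """Count how often each distinct ERROR message occurs in the log text."""
    counts = Counter()
    for entry in join_wrapped(text.splitlines()):
        m = ENTRY_RE.match(entry)
        if m and m.group("level") == "ERROR":
            counts[m.group("msg")] += 1
    return counts


# Three entries copied from the log above, including the mail line wraps.
sample = """\
t=2016-03-11 15:23:51 PST d= h= p=7227 a=ERROR:  XX000: tuple natts
mismatch, 26 vs 28
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION:  read_tuple_parts,
bdr_apply.c:1892
t=2016-03-11 15:23:57 PST d= h= p=7230 a=ERROR:  XX000: tuple natts
mismatch, 26 vs 28
"""

for msg, n in summarize_errors(sample).most_common():
    print(n, msg)
```

Run against the full excerpt, this should show the "tuple natts mismatch, 26 vs 28" error once per cycle, each followed by a "catchup worker exited" error whose target LSN differs from cycle to cycle.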

This is from the primary node:

svp2=# SELECT
      slot_name, database, active,
      pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes
    FROM pg_replication_slots
    WHERE plugin = 'bdr';
                slot_name                | database | active | retained_bytes 
-----------------------------------------+----------+--------+----------------
 bdr_16385_6260914790689848233_1_16385__ | svp2     | f      |      687816472
(1 row)
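
To put these figures in perspective, the distance between two WAL positions can be worked out by hand. The sketch below is in Python and assumes the usual LSN notation, in which the part before the slash is the high 32 bits and the part after it the low 32 bits, both in hexadecimal. The helper names are hypothetical. The two LSNs are the first catchup target and the replication start position from the subscriber log above.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '19E/10AC4F0' into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(a: str, b: str) -> int:
    """Byte distance from position b to position a."""
    return lsn_to_int(a) - lsn_to_int(b)


def pretty_mib(n: int) -> str:
    """Format a byte count in MiB with two decimals."""
    return f"{n / (1024 * 1024):.2f} MiB"


catchup_target = "19E/10AC4F0"   # from "catchup apply to 19E/10AC4F0"
apply_start = "19D/D204D0C8"     # from "starting up replication from 1 at ..."

gap = lsn_diff(catchup_target, apply_start)
print(gap, pretty_mib(gap))      # WAL the catchup worker still has to replay
print(pretty_mib(687816472))     # retained_bytes reported by the slot query
```

Because the apply worker restarts from the same position each time while the target LSN keeps advancing, the first of these numbers will grow for as long as the error persists.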


And this same scenario happens every time I try to add a new node.

Thank you,

Carter 




--
View this message in context: http://postgresql.nabble.com/BDR-not-catching-up-tp5892335.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

