It makes sense now why that happened and what to do in case of emergency.
At Mon, 08 Feb 2021 22:42:21 +0700, Игорь Выскорко <vyskorko.igor@xxxxxxxxx> wrote in
Hi, community!
Unfortunately, I can't find an answer in the docs or on Google. You are my only hope :)
[local]:5433 postgres@postgres=# drop publication pub ;
DROP PUBLICATION
Time: 3,793 ms
[local]:5433 postgres@postgres=# insert into tbl(d) values ('test2');
INSERT 0 1
Time: 9,002 ms
[local]:5433 postgres@postgres=# create publication pub for table tbl;
CREATE PUBLICATION
Time: 6,646 ms
Result: nothing changed; the same errors appear again and again. I couldn't
find how to restore replication without dropping and recreating the
subscription.
If you had recreated the publication before the insert, replication would
have continued.
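
As a rough sketch of that ordering, using the same publication and table
names as in your session (run on the publisher; I have not re-tested this
exact sequence):

```sql
-- Publisher side. The publication is recreated BEFORE any new change
-- to tbl is written, so the INSERT is decoded with a catalog snapshot
-- in which "pub" exists.
DROP PUBLICATION pub;
CREATE PUBLICATION pub FOR TABLE tbl;
INSERT INTO tbl(d) VALUES ('test2');
```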
Questions here:
1. What is going on under the hood here: why does walsender think that
'publication "pub" does not exist' when it actually exists?
The answer is "because the publication did not exist at the time of
the INSERT". Thus the insert cannot be replicated using the new
publication.
This is because logical replication looks up publications using the
same snapshot as the WAL record being sent. Although that is the
designed behavior, I'm not sure that is also true for pg_publication.
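
One way to look at the mismatch from the publisher side, as a sketch only:
the current catalog shows the publication, while the logical slot has not
yet confirmed the position of the INSERT, so that record is still decoded
with the older catalog state. The views and columns below are standard; the
reading of the output assumes the subscription's slot is the only logical
slot on this server.

```sql
-- Publisher side. The current catalog: "pub" is visible here.
SELECT pubname
FROM pg_publication
WHERE pubname = 'pub';

-- The slot position versus the current WAL position. If
-- confirmed_flush_lsn stays behind and does not advance, the walsender
-- is stuck on a record written while "pub" did not exist.
SELECT slot_name,
       confirmed_flush_lsn,
       pg_current_wal_lsn() AS current_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';
```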
2. What is the right way to restore replication in my example?
The most conservative way is to drop the subscription, delete all rows
from the subscriber table, and then recreate the subscription. This
allows the newly created publication to work.
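
A minimal sketch of those steps, run on the subscriber. The subscription
name "sub" and the connection string are assumptions of mine (they do not
appear in your mail), so substitute your own:

```sql
-- Subscriber side. "sub" and the CONNECTION string are hypothetical.
-- Dropping the subscription also drops its replication slot on the
-- publisher by default.
DROP SUBSCRIPTION sub;

-- Empty the table so the initial copy does not collide with old rows.
DELETE FROM tbl;

-- copy_data defaults to true, so the table is copied again from the
-- publisher before streaming starts.
CREATE SUBSCRIPTION sub
    CONNECTION 'host=localhost port=5433 dbname=postgres user=postgres'
    PUBLICATION pub;
```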
Alternatively, you can drop the subscription, manually fix the subscriber
table so that it matches the publisher table, and then create a new
subscription using WITH (copy_data = false).
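
Roughly like this, again on the subscriber. As above, "sub" and the
connection string are made-up names, and the INSERT assumes that 'test2'
is the only row the subscriber missed and that any other columns of tbl
get matching values by default; check that against your actual data first:

```sql
-- Subscriber side. "sub" and the CONNECTION string are hypothetical.
DROP SUBSCRIPTION sub;

-- Manual fix-up: apply the change that was never replicated.
-- Assumption: this single row is the whole difference.
INSERT INTO tbl(d) VALUES ('test2');

-- Skip the initial table copy; only changes made after the new slot
-- is created are streamed.
CREATE SUBSCRIPTION sub
    CONNECTION 'host=localhost port=5433 dbname=postgres user=postgres'
    PUBLICATION pub
    WITH (copy_data = false);
```

Changes made on the publisher between the manual fix and the creation of
the new subscription would not be replicated, so it is safer to do this
while the table is not being written to.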
regards.
Kyotaro Horiguchi
NTT Open Source Software Center