Hi,

I have run into a (to me) weird issue with logical replication. We are
running Zalando's postgres-operator in our Kubernetes clusters and
recently had a use case where we wanted to start logically replicating
select tables to a data warehouse, which also runs PostgreSQL.

It worked as expected at first, but after a pod restart in Kubernetes
the replication slots that had been created for the subscription were
gone. A bit of reading later, I learned that we need to tell Patroni
which slots should be permanently available. So we specified a slot and
tried to set this up again, but then ran into an error saying the
publication does not exist, even though we can verify that it does.

At first I suspected Patroni's handling of the replication slots to be
the cause of the problem, but about a week's worth of learning and
experimenting later, I can now reliably reproduce the problem in pure
PostgreSQL. Patroni is more of a catalyst. My findings are:

- If the replication slot is created before data is inserted into the
  source database and the publication is created, replication breaks.
- If the replication slot is created after data is inserted and the
  publication is created, it works.

We just can't tell Patroni to hold off creating the slot until some
arbitrary later point in time. I am guessing this is either a bug or a
case of us not knowing what we are doing...
I have created a GitHub gist demonstrating the problem: https://gist.github.com/drzero42/02b4082ce002c1d90ddd64f5fe03aee0
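
For readers who do not want to open the gist, here is a rough sketch of
the two orderings on the source database. The table, slot and
publication names (demo, demo_slot, demo_pub) and the connection string
are placeholders I made up for illustration, not copied from the gist,
and the script only prints the SQL; it does not connect to anything.

```python
from typing import List

# Hypothetical names, for illustration only.
CREATE_SLOT = "SELECT pg_create_logical_replication_slot('demo_slot', 'pgoutput');"

LOAD_AND_PUBLISH = [
    "CREATE TABLE demo (id int PRIMARY KEY, val text);",
    "INSERT INTO demo VALUES (1, 'a');",
    "CREATE PUBLICATION demo_pub FOR TABLE demo;",
]

# Run on the subscriber (which needs a matching demo table); it reuses
# the slot that already exists on the source instead of creating one.
SUBSCRIBER_SQL = (
    "CREATE SUBSCRIPTION demo_sub "
    "CONNECTION 'host=source-db dbname=app' "  # placeholder connection string
    "PUBLICATION demo_pub "
    "WITH (create_slot = false, slot_name = 'demo_slot');"
)


def source_steps(slot_first: bool) -> List[str]:
    """Return the source-side SQL in one of the two orderings.

    slot_first=True  -> slot exists before the data and the publication
                        (the ordering that breaks for us).
    slot_first=False -> slot is created last (the ordering that works).
    """
    if slot_first:
        return [CREATE_SLOT] + LOAD_AND_PUBLISH
    return LOAD_AND_PUBLISH + [CREATE_SLOT]


if __name__ == "__main__":
    for label, slot_first in (("breaks", True), ("works", False)):
        print(f"-- ordering that {label}")
        for stmt in source_steps(slot_first):
            print(stmt)
        print(SUBSCRIBER_SQL)
        print()
```

Both orderings run exactly the same statements; only the position of
the slot creation differs, which is what makes the failure surprising
to me.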
Anybody able to help? :)
--
Anders Bøgh Bruun
Infrastructure Architect
CellPoint digital
cellpointdigital.com
WE MAKE TRAVEL EASIER™
M: +45 31 14 87 41
Chicago | Copenhagen | Dubai | London | Miami | Pune | Singapore