Hello
my topology/environment is :
test linux , pgsql 10.13 (subscription) <--- prod linux, pgsql 10.13 (publication)
In order to connect to our accounting system (ms sql) , I needed
to do cross-db queries, so I setup freetds (1.1.42) and
tds_fdw-2.0.1 and tested them on the test system (subscription
node). This involved the IMPORT of some 1000 ms sql tables in a
dedicated schema, plus a few others in a second dedicated schema.
All went fine, most tests worked nice, so I thought of installing
tds_fdw in the prod host. Needless to say, the new mssql dedicated
schemas did not involve any logical replication . After importing
the 1000 tables in prod with,
IMPORT FOREIGN SCHEMA dbo EXCEPT ("....") FROM SERVER
mssql_bdynacom_srv INTO mssql_bdynacom OPTIONS (import_default
'true');
after 15 mins (wal_sender_timeout / wal_receiver_timeout) I started getting in the test node :
[120394] 5ee085dc.1d64a 2020-06-16 17:10:26.450 EEST @ line:5
ERROR: terminating logical replication worker due to timeout
[87028] 5ee8d2d2.153f4 2020-06-16 17:10:26.467 EEST @ line:1
LOG: logical replication apply worker for subscription
"data_for_testsmadb_pub" has started
[87028] 5ee8d2d2.153f4 2020-06-16 17:10:26.545 EEST @ line:2
ERROR: could not start WAL streaming: ERROR: replication slot
"data_for_testsmadb_pub" is active for PID 10991
.....
in the meantime the "wal sender" on the prod node is idle. And
seems stuck. Apparently the addition of those 1000 foreign tables
messed up with the replication slot. I tried to move forward the
replication slot, but even right after restarting the two systems
, I can't seem to find any handle to call : pg_replication_origin_advance
dynacom=#
select * from pg_replication_origin;
roident | roname
---------+--------
(0 rows)
dynacom=# select * from pg_replication_origin_status ;
local_id | external_id | remote_lsn | local_lsn
----------+-------------+------------+-----------
(0 rows)
dynacom=#
So, I am pretty much at loss here . Go the guarantied way : drop
the two mssql schemas (i.e. revert to pre-incident state) and
re-doing the whole subscription node initialization and setup from
scratch seems like a possible scenario but I'd like to avoid that,
logical replication works fine for two years, and it will take
maybe days to catch up, I am not at the office, I work from home
and this makes things worse. Another option is to leave the
tds_fdw and the foreign table definitions and re-do the logical
replication setup. Please help!